From patchwork Thu Jan 28 09:04:46 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053209
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 4B813C433DB
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:18 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id D415964DD6
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:17 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S231259AbhA1JH5 (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:07:57 -0500
Received: from relayfre-01.paragon-software.com ([176.12.100.13]:47782 "EHLO
        relayfre-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231766AbhA1JGK (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:06:10 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relayfre-01.paragon-software.com (Postfix) with ESMTPS id
 9077E1D7C;
        Thu, 28 Jan 2021 12:05:02 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824702;
        bh=/Ll+fAeRkLUfDElu8ZCtygU2fGro9EVUROQ7gP1eDeM=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=LF/fOfUop4eO3krTnUud23YzWQ+jTl5d4rzJxVpo4Hc0WKKt3nBpy7x0ODvICYuJi
         hQJ7rhRr4esVfhMWZ6mfCr0a5vzwxlvqPJWTNl8AUc5PfwHCQOK/scvzphQtIgt6O/
         +x0o8qTQif326xWthgqDt21zxn+gh9zL4w1IBpxo=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:01 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 01/10] fs/ntfs3: Add headers and misc files
Date: Thu, 28 Jan 2021 12:04:46 +0300
Message-ID: 
 <20210128090455.3576502-2-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds headers and misc files

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 fs/ntfs3/debug.h   |   62 +++
 fs/ntfs3/ntfs.h    | 1238 ++++++++++++++++++++++++++++++++++++++++++++
 fs/ntfs3/ntfs_fs.h | 1065 +++++++++++++++++++++++++++++++++++++
 fs/ntfs3/upcase.c  |  100 ++++
 4 files changed, 2465 insertions(+)
 create mode 100644 fs/ntfs3/debug.h
 create mode 100644 fs/ntfs3/ntfs.h
 create mode 100644 fs/ntfs3/ntfs_fs.h
 create mode 100644 fs/ntfs3/upcase.c

diff --git a/fs/ntfs3/debug.h b/fs/ntfs3/debug.h
new file mode 100644
index 000000000000..41893421081a
--- /dev/null
+++ b/fs/ntfs3/debug.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ * useful functions for debuging
+ */
+
+// clang-format off
+#ifndef Add2Ptr
+#define Add2Ptr(P, I)		((void *)((u8 *)(P) + (I)))
+#define PtrOffset(B, O)		((size_t)((size_t)(O) - (size_t)(B)))
+#endif
+
+#define QuadAlign(n)		(((n) + 7u) & (~7u))
+#define IsQuadAligned(n)	(!((size_t)(n)&7u))
+#define Quad2Align(n)		(((n) + 15u) & (~15u))
+#define IsQuad2Aligned(n)	(!((size_t)(n)&15u))
+#define Quad4Align(n)		(((n) + 31u) & (~31u))
+#define IsSizeTAligned(n)	(!((size_t)(n) & (sizeof(size_t) - 1)))
+#define DwordAlign(n)		(((n) + 3u) & (~3u))
+#define IsDwordAligned(n)	(!((size_t)(n)&3u))
+#define WordAlign(n)		(((n) + 1u) & (~1u))
+#define IsWordAligned(n)	(!((size_t)(n)&1u))
+
+#ifdef CONFIG_PRINTK
+__printf(2, 3)
+void ntfs_printk(const struct super_block *sb, const char *fmt, ...);
+__printf(2, 3)
+void ntfs_inode_printk(struct inode *inode, const char *fmt, ...);
+#else
+static inline __printf(2, 3)
+void ntfs_printk(const struct super_block *sb, const char *fmt, ...)
+{
+}
+
+static inline __printf(2, 3)
+void ntfs_inode_printk(struct inode *inode, const char *fmt, ...)
+{
+}
+#endif
+
+/*
+ * Logging macros ( thanks Joe Perches <joe@perches.com> for implementation )
+ */
+
+#define ntfs_err(sb, fmt, ...)  ntfs_printk(sb, KERN_ERR fmt, ##__VA_ARGS__)
+#define ntfs_warn(sb, fmt, ...) ntfs_printk(sb, KERN_WARNING fmt, ##__VA_ARGS__)
+#define ntfs_info(sb, fmt, ...) ntfs_printk(sb, KERN_INFO fmt, ##__VA_ARGS__)
+#define ntfs_notice(sb, fmt, ...)                                              \
+	ntfs_printk(sb, KERN_NOTICE fmt, ##__VA_ARGS__)
+
+#define ntfs_inode_err(inode, fmt, ...)                                        \
+	ntfs_inode_printk(inode, KERN_ERR fmt, ##__VA_ARGS__)
+#define ntfs_inode_warn(inode, fmt, ...)                                       \
+	ntfs_inode_printk(inode, KERN_WARNING fmt, ##__VA_ARGS__)
+
+#define ntfs_malloc(s)		kmalloc(s, GFP_NOFS)
+#define ntfs_zalloc(s)		kzalloc(s, GFP_NOFS)
+#define ntfs_free(p)		kfree(p)
+#define ntfs_memdup(src, len)	kmemdup(src, len, GFP_NOFS)
+// clang-format on
diff --git a/fs/ntfs3/ntfs.h b/fs/ntfs3/ntfs.h
new file mode 100644
index 000000000000..19aac05ff3d1
--- /dev/null
+++ b/fs/ntfs3/ntfs.h
@@ -0,0 +1,1238 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ * on-disk ntfs structs
+ */
+
+// clang-format off
+
+/* TODO:
+ * - Check 4K mft record and 512 bytes cluster
+ */
+
+/*
+ * Activate this define to use binary search in indexes
+ */
+#define NTFS3_INDEX_BINARY_SEARCH
+
+/*
+ * Check each run for marked clusters
+ */
+#define NTFS3_CHECK_FREE_CLST
+
+#define NTFS_NAME_LEN 255
+
+/*
+ * ntfs.sys used 500 maximum links
+ * on-disk struct allows up to 0xffff
+ */
+#define NTFS_LINK_MAX 0x400
+//#define NTFS_LINK_MAX 0xffff
+
+/*
+ * Activate to use 64 bit clusters instead of 32 bits in ntfs.sys
+ * Logical and virtual cluster number
+ * If needed, may be redefined to use 64 bit value
+ */
+//#define NTFS3_64BIT_CLUSTER
+
+#define NTFS_LZNT_MAX_CLUSTER	4096
+#define NTFS_LZNT_CUNIT		4
+#define NTFS_LZNT_CLUSTERS	(1u<<NTFS_LZNT_CUNIT)
+
+struct GUID {
+	__le32 Data1;
+	__le16 Data2;
+	__le16 Data3;
+	u8 Data4[8];
+};
+
+/*
+ * this struct repeats layout of ATTR_FILE_NAME
+ * at offset 0x40
+ * it used to store global constants NAME_MFT/NAME_MIRROR...
+ * most constant names are shorter than 10
+ */
+struct cpu_str {
+	u8 len;
+	u8 unused;
+	u16 name[10];
+};
+
+struct le_str {
+	u8 len;
+	u8 unused;
+	__le16 name[1];
+};
+
+static_assert(SECTOR_SHIFT == 9);
+
+#ifdef NTFS3_64BIT_CLUSTER
+typedef u64 CLST;
+static_assert(sizeof(size_t) == 8);
+#else
+typedef u32 CLST;
+#endif
+
+#define SPARSE_LCN64   ((u64)-1)
+#define SPARSE_LCN     ((CLST)-1)
+#define RESIDENT_LCN   ((CLST)-2)
+#define COMPRESSED_LCN ((CLST)-3)
+
+#define COMPRESSION_UNIT     4
+#define COMPRESS_MAX_CLUSTER 0x1000
+#define MFT_INCREASE_CHUNK   1024
+
+enum RECORD_NUM {
+	MFT_REC_MFT		= 0,
+	MFT_REC_MIRR		= 1,
+	MFT_REC_LOG		= 2,
+	MFT_REC_VOL		= 3,
+	MFT_REC_ATTR		= 4,
+	MFT_REC_ROOT		= 5,
+	MFT_REC_BITMAP		= 6,
+	MFT_REC_BOOT		= 7,
+	MFT_REC_BADCLUST	= 8,
+	//MFT_REC_QUOTA		= 9,
+	MFT_REC_SECURE		= 9, // NTFS 3.0
+	MFT_REC_UPCASE		= 10,
+	MFT_REC_EXTEND		= 11, // NTFS 3.0
+	MFT_REC_RESERVED	= 11,
+	MFT_REC_FREE		= 16,
+	MFT_REC_USER		= 24,
+};
+
+enum ATTR_TYPE {
+	ATTR_ZERO		= cpu_to_le32(0x00),
+	ATTR_STD		= cpu_to_le32(0x10),
+	ATTR_LIST		= cpu_to_le32(0x20),
+	ATTR_NAME		= cpu_to_le32(0x30),
+	// ATTR_VOLUME_VERSION on Nt4
+	ATTR_ID			= cpu_to_le32(0x40),
+	ATTR_SECURE		= cpu_to_le32(0x50),
+	ATTR_LABEL		= cpu_to_le32(0x60),
+	ATTR_VOL_INFO		= cpu_to_le32(0x70),
+	ATTR_DATA		= cpu_to_le32(0x80),
+	ATTR_ROOT		= cpu_to_le32(0x90),
+	ATTR_ALLOC		= cpu_to_le32(0xA0),
+	ATTR_BITMAP		= cpu_to_le32(0xB0),
+	// ATTR_SYMLINK on Nt4
+	ATTR_REPARSE		= cpu_to_le32(0xC0),
+	ATTR_EA_INFO		= cpu_to_le32(0xD0),
+	ATTR_EA			= cpu_to_le32(0xE0),
+	ATTR_PROPERTYSET	= cpu_to_le32(0xF0),
+	ATTR_LOGGED_UTILITY_STREAM = cpu_to_le32(0x100),
+	ATTR_END		= cpu_to_le32(0xFFFFFFFF)
+};
+
+static_assert(sizeof(enum ATTR_TYPE) == 4);
+
+enum FILE_ATTRIBUTE {
+	FILE_ATTRIBUTE_READONLY		= cpu_to_le32(0x00000001),
+	FILE_ATTRIBUTE_HIDDEN		= cpu_to_le32(0x00000002),
+	FILE_ATTRIBUTE_SYSTEM		= cpu_to_le32(0x00000004),
+	FILE_ATTRIBUTE_ARCHIVE		= cpu_to_le32(0x00000020),
+	FILE_ATTRIBUTE_DEVICE		= cpu_to_le32(0x00000040),
+	FILE_ATTRIBUTE_TEMPORARY	= cpu_to_le32(0x00000100),
+	FILE_ATTRIBUTE_SPARSE_FILE	= cpu_to_le32(0x00000200),
+	FILE_ATTRIBUTE_REPARSE_POINT	= cpu_to_le32(0x00000400),
+	FILE_ATTRIBUTE_COMPRESSED	= cpu_to_le32(0x00000800),
+	FILE_ATTRIBUTE_OFFLINE		= cpu_to_le32(0x00001000),
+	FILE_ATTRIBUTE_NOT_CONTENT_INDEXED = cpu_to_le32(0x00002000),
+	FILE_ATTRIBUTE_ENCRYPTED	= cpu_to_le32(0x00004000),
+	FILE_ATTRIBUTE_VALID_FLAGS	= cpu_to_le32(0x00007fb7),
+	FILE_ATTRIBUTE_DIRECTORY	= cpu_to_le32(0x10000000),
+};
+
+static_assert(sizeof(enum FILE_ATTRIBUTE) == 4);
+
+extern const struct cpu_str NAME_MFT;
+extern const struct cpu_str NAME_MIRROR;
+extern const struct cpu_str NAME_LOGFILE;
+extern const struct cpu_str NAME_VOLUME;
+extern const struct cpu_str NAME_ATTRDEF;
+extern const struct cpu_str NAME_ROOT;
+extern const struct cpu_str NAME_BITMAP;
+extern const struct cpu_str NAME_BOOT;
+extern const struct cpu_str NAME_BADCLUS;
+extern const struct cpu_str NAME_QUOTA;
+extern const struct cpu_str NAME_SECURE;
+extern const struct cpu_str NAME_UPCASE;
+extern const struct cpu_str NAME_EXTEND;
+extern const struct cpu_str NAME_OBJID;
+extern const struct cpu_str NAME_REPARSE;
+extern const struct cpu_str NAME_USNJRNL;
+
+extern const __le16 I30_NAME[4];
+extern const __le16 SII_NAME[4];
+extern const __le16 SDH_NAME[4];
+extern const __le16 SO_NAME[2];
+extern const __le16 SQ_NAME[2];
+extern const __le16 SR_NAME[2];
+
+extern const __le16 BAD_NAME[4];
+extern const __le16 SDS_NAME[4];
+extern const __le16 WOF_NAME[17];	/* WofCompressedData */
+
+/* MFT record number structure */
+struct MFT_REF {
+	__le32 low;	// The low part of the number
+	__le16 high;	// The high part of the number
+	__le16 seq;	// The sequence number of MFT record
+};
+
+static_assert(sizeof(__le64) == sizeof(struct MFT_REF));
+
+static inline CLST ino_get(const struct MFT_REF *ref)
+{
+#ifdef NTFS3_64BIT_CLUSTER
+	return le32_to_cpu(ref->low) | ((u64)le16_to_cpu(ref->high) << 32);
+#else
+	return le32_to_cpu(ref->low);
+#endif
+}
+
+struct NTFS_BOOT {
+	u8 jump_code[3];	// 0x00: Jump to boot code
+	u8 system_id[8];	// 0x03: System ID, equals "NTFS    "
+
+	// NOTE: this member is not aligned(!)
+	// bytes_per_sector[0] must be 0
+	// bytes_per_sector[1] must be multiplied by 256
+	u8 bytes_per_sector[2];	// 0x0B: Bytes per sector
+
+	u8 sectors_per_clusters;// 0x0D: Sectors per cluster
+	u8 unused1[7];
+	u8 media_type;		// 0x15: Media type (0xF8 - harddisk)
+	u8 unused2[2];
+	__le16 sct_per_track;	// 0x18: number of sectors per track
+	__le16 heads;		// 0x1A: number of heads per cylinder
+	__le32 hidden_sectors;	// 0x1C: number of 'hidden' sectors
+	u8 unused3[4];
+	u8 bios_drive_num;	// 0x24: BIOS drive number =0x80
+	u8 unused4;
+	u8 signature_ex;	// 0x26: Extended BOOT signature =0x80
+	u8 unused5;
+	__le64 sectors_per_volume;// 0x28: size of volume in sectors
+	__le64 mft_clst;	// 0x30: first cluster of $MFT
+	__le64 mft2_clst;	// 0x38: first cluster of $MFTMirr
+	s8 record_size;		// 0x40: size of MFT record in clusters(sectors)
+	u8 unused6[3];
+	s8 index_size;		// 0x44: size of INDX record in clusters(sectors)
+	u8 unused7[3];
+	__le64 serial_num;	// 0x48: Volume serial number
+	__le32 check_sum;	// 0x50: Simple additive checksum of all
+				// of the u32's which precede the 'check_sum'
+
+	u8 boot_code[0x200 - 0x50 - 2 - 4]; // 0x54:
+	u8 boot_magic[2];	// 0x1FE: Boot signature =0x55 + 0xAA
+};
+
+static_assert(sizeof(struct NTFS_BOOT) == 0x200);
+
+enum NTFS_SIGNATURE {
+	NTFS_FILE_SIGNATURE = cpu_to_le32(0x454C4946), // 'FILE'
+	NTFS_INDX_SIGNATURE = cpu_to_le32(0x58444E49), // 'INDX'
+	NTFS_CHKD_SIGNATURE = cpu_to_le32(0x444B4843), // 'CHKD'
+	NTFS_RSTR_SIGNATURE = cpu_to_le32(0x52545352), // 'RSTR'
+	NTFS_RCRD_SIGNATURE = cpu_to_le32(0x44524352), // 'RCRD'
+	NTFS_BAAD_SIGNATURE = cpu_to_le32(0x44414142), // 'BAAD'
+	NTFS_HOLE_SIGNATURE = cpu_to_le32(0x454C4F48), // 'HOLE'
+	NTFS_FFFF_SIGNATURE = cpu_to_le32(0xffffffff),
+};
+
+static_assert(sizeof(enum NTFS_SIGNATURE) == 4);
+
+/* MFT Record header structure */
+struct NTFS_RECORD_HEADER {
+	/* Record magic number, equals 'FILE'/'INDX'/'RSTR'/'RCRD' */
+	enum NTFS_SIGNATURE sign; // 0x00:
+	__le16 fix_off;		// 0x04:
+	__le16 fix_num;		// 0x06:
+	__le64 lsn;		// 0x08: Log file sequence number
+};
+
+static_assert(sizeof(struct NTFS_RECORD_HEADER) == 0x10);
+
+static inline int is_baad(const struct NTFS_RECORD_HEADER *hdr)
+{
+	return hdr->sign == NTFS_BAAD_SIGNATURE;
+}
+
+/* Possible bits in struct MFT_REC.flags */
+enum RECORD_FLAG {
+	RECORD_FLAG_IN_USE	= cpu_to_le16(0x0001),
+	RECORD_FLAG_DIR		= cpu_to_le16(0x0002),
+	RECORD_FLAG_SYSTEM	= cpu_to_le16(0x0004),
+	RECORD_FLAG_UNKNOWN	= cpu_to_le16(0x0008),
+};
+
+/* MFT Record structure */
+struct MFT_REC {
+	struct NTFS_RECORD_HEADER rhdr; // 'FILE'
+
+	__le16 seq;		// 0x10: Sequence number for this record
+	__le16 hard_links;	// 0x12: The number of hard links to record
+	__le16 attr_off;	// 0x14: Offset to attributes
+	__le16 flags;		// 0x16: See RECORD_FLAG
+	__le32 used;		// 0x18: The size of used part
+	__le32 total;		// 0x1C: Total record size
+
+	struct MFT_REF parent_ref; // 0x20: Parent MFT record
+	__le16 next_attr_id;	// 0x28: The next attribute Id
+
+	__le16 res;		// 0x2A: High part of mft record?
+	__le32 mft_record;	// 0x2C: Current mft record number
+	__le16 fixups[1];	// 0x30:
+};
+
+#define MFTRECORD_FIXUP_OFFSET_1 offsetof(struct MFT_REC, res)
+#define MFTRECORD_FIXUP_OFFSET_3 offsetof(struct MFT_REC, fixups)
+
+static_assert(MFTRECORD_FIXUP_OFFSET_1 == 0x2A);
+static_assert(MFTRECORD_FIXUP_OFFSET_3 == 0x30);
+
+static inline bool is_rec_base(const struct MFT_REC *rec)
+{
+	const struct MFT_REF *r = &rec->parent_ref;
+
+	return !r->low && !r->high && !r->seq;
+}
+
+static inline bool is_mft_rec5(const struct MFT_REC *rec)
+{
+	return le16_to_cpu(rec->rhdr.fix_off) >=
+	       offsetof(struct MFT_REC, fixups);
+}
+
+static inline bool is_rec_inuse(const struct MFT_REC *rec)
+{
+	return rec->flags & RECORD_FLAG_IN_USE;
+}
+
+static inline bool clear_rec_inuse(struct MFT_REC *rec)
+{
+	return rec->flags &= ~RECORD_FLAG_IN_USE;
+}
+
+/* Possible values of ATTR_RESIDENT.flags */
+#define RESIDENT_FLAG_INDEXED 0x01
+
+struct ATTR_RESIDENT {
+	__le32 data_size;	// 0x10: The size of data
+	__le16 data_off;	// 0x14: Offset to data
+	u8 flags;		// 0x16: resident flags ( 1 - indexed )
+	u8 res;			// 0x17:
+}; // sizeof() = 0x18
+
+struct ATTR_NONRESIDENT {
+	__le64 svcn;		// 0x10: Starting VCN of this segment
+	__le64 evcn;		// 0x18: End VCN of this segment
+	__le16 run_off;		// 0x20: Offset to packed runs
+	//  Unit of Compression size for this stream, expressed
+	//  as a log of the cluster size.
+	//
+	//	0 means file is not compressed
+	//	1, 2, 3, and 4 are potentially legal values if the
+	//	    stream is compressed, however the implementation
+	//	    may only choose to use 4, or possibly 3.  Note
+	//	    that 4 means cluster size time 16.	If convenient
+	//	    the implementation may wish to accept a
+	//	    reasonable range of legal values here (1-5?),
+	//	    even if the implementation only generates
+	//	    a smaller set of values itself.
+	u8 c_unit;		// 0x22
+	u8 res1[5];		// 0x23:
+	__le64 alloc_size;	// 0x28: The allocated size of attribute in bytes
+				// (multiple of cluster size)
+	__le64 data_size;	// 0x30: The size of attribute  in bytes <= alloc_size
+	__le64 valid_size;	// 0x38: The size of valid part in bytes <= data_size
+	__le64 total_size;	// 0x40: The sum of the allocated clusters for a file
+				// (present only for the first segment (0 == vcn)
+				// of compressed attribute)
+
+}; // sizeof()=0x40 or 0x48 (if compressed)
+
+/* Possible values of ATTRIB.flags: */
+#define ATTR_FLAG_COMPRESSED	  cpu_to_le16(0x0001)
+#define ATTR_FLAG_COMPRESSED_MASK cpu_to_le16(0x00FF)
+#define ATTR_FLAG_ENCRYPTED	  cpu_to_le16(0x4000)
+#define ATTR_FLAG_SPARSED	  cpu_to_le16(0x8000)
+
+struct ATTRIB {
+	enum ATTR_TYPE type;	// 0x00: The type of this attribute
+	__le32 size;		// 0x04: The size of this attribute
+	u8 non_res;		// 0x08: Is this attribute non-resident ?
+	u8 name_len;		// 0x09: This attribute name length
+	__le16 name_off;	// 0x0A: Offset to the attribute name
+	__le16 flags;		// 0x0C: See ATTR_FLAG_XXX
+	__le16 id;		// 0x0E: unique id (per record)
+
+	union {
+		struct ATTR_RESIDENT res;     // 0x10
+		struct ATTR_NONRESIDENT nres; // 0x10
+	};
+};
+
+/* Define attribute sizes */
+#define SIZEOF_RESIDENT			0x18
+#define SIZEOF_NONRESIDENT_EX		0x48
+#define SIZEOF_NONRESIDENT		0x40
+
+#define SIZEOF_RESIDENT_LE		cpu_to_le16(0x18)
+#define SIZEOF_NONRESIDENT_EX_LE	cpu_to_le16(0x48)
+#define SIZEOF_NONRESIDENT_LE		cpu_to_le16(0x40)
+
+static inline u64 attr_ondisk_size(const struct ATTRIB *attr)
+{
+	return attr->non_res ? ((attr->flags &
+				 (ATTR_FLAG_COMPRESSED | ATTR_FLAG_SPARSED)) ?
+					le64_to_cpu(attr->nres.total_size) :
+					le64_to_cpu(attr->nres.alloc_size)) :
+			       QuadAlign(le32_to_cpu(attr->res.data_size));
+}
+
+static inline u64 attr_size(const struct ATTRIB *attr)
+{
+	return attr->non_res ? le64_to_cpu(attr->nres.data_size) :
+			       le32_to_cpu(attr->res.data_size);
+}
+
+static inline bool is_attr_encrypted(const struct ATTRIB *attr)
+{
+	return attr->flags & ATTR_FLAG_ENCRYPTED;
+}
+
+static inline bool is_attr_sparsed(const struct ATTRIB *attr)
+{
+	return attr->flags & ATTR_FLAG_SPARSED;
+}
+
+static inline bool is_attr_compressed(const struct ATTRIB *attr)
+{
+	return attr->flags & ATTR_FLAG_COMPRESSED;
+}
+
+static inline bool is_attr_ext(const struct ATTRIB *attr)
+{
+	return attr->flags & (ATTR_FLAG_SPARSED | ATTR_FLAG_COMPRESSED);
+}
+
+static inline bool is_attr_indexed(const struct ATTRIB *attr)
+{
+	return !attr->non_res && (attr->res.flags & RESIDENT_FLAG_INDEXED);
+}
+
+static const inline __le16 *attr_name(const struct ATTRIB *attr)
+{
+	return Add2Ptr(attr, le16_to_cpu(attr->name_off));
+}
+
+static inline u64 attr_svcn(const struct ATTRIB *attr)
+{
+	return attr->non_res ? le64_to_cpu(attr->nres.svcn) : 0;
+}
+
+/* the size of resident attribute by its resident size */
+#define BYTES_PER_RESIDENT(b) (0x18 + (b))
+
+static_assert(sizeof(struct ATTRIB) == 0x48);
+static_assert(sizeof(((struct ATTRIB *)NULL)->res) == 0x08);
+static_assert(sizeof(((struct ATTRIB *)NULL)->nres) == 0x38);
+
+static inline void *resident_data_ex(const struct ATTRIB *attr, u32 datasize)
+{
+	u32 asize, rsize;
+	u16 off;
+
+	if (attr->non_res)
+		return NULL;
+
+	asize = le32_to_cpu(attr->size);
+	off = le16_to_cpu(attr->res.data_off);
+
+	if (asize < datasize + off)
+		return NULL;
+
+	rsize = le32_to_cpu(attr->res.data_size);
+	if (rsize < datasize)
+		return NULL;
+
+	return Add2Ptr(attr, off);
+}
+
+static inline void *resident_data(const struct ATTRIB *attr)
+{
+	return Add2Ptr(attr, le16_to_cpu(attr->res.data_off));
+}
+
+static inline void *attr_run(const struct ATTRIB *attr)
+{
+	return Add2Ptr(attr, le16_to_cpu(attr->nres.run_off));
+}
+
+/* Standard information attribute (0x10) */
+struct ATTR_STD_INFO {
+	__le64 cr_time;		// 0x00: File creation file
+	__le64 m_time;		// 0x08: File modification time
+	__le64 c_time;		// 0x10: Last time any attribute was modified
+	__le64 a_time;		// 0x18: File last access time
+	enum FILE_ATTRIBUTE fa; // 0x20: Standard DOS attributes & more
+	__le32 max_ver_num;	// 0x24: Maximum Number of Versions
+	__le32 ver_num;		// 0x28: Version Number
+	__le32 class_id;	// 0x2C: Class Id from bidirectional Class Id index
+};
+
+static_assert(sizeof(struct ATTR_STD_INFO) == 0x30);
+
+#define SECURITY_ID_INVALID 0x00000000
+#define SECURITY_ID_FIRST 0x00000100
+
+struct ATTR_STD_INFO5 {
+	__le64 cr_time;		// 0x00: File creation file
+	__le64 m_time;		// 0x08: File modification time
+	__le64 c_time;		// 0x10: Last time any attribute was modified
+	__le64 a_time;		// 0x18: File last access time
+	enum FILE_ATTRIBUTE fa; // 0x20: Standard DOS attributes & more
+	__le32 max_ver_num;	// 0x24: Maximum Number of Versions
+	__le32 ver_num;		// 0x28: Version Number
+	__le32 class_id;	// 0x2C: Class Id from bidirectional Class Id index
+
+	__le32 owner_id;	// 0x30: Owner Id of the user owning the file.
+	__le32 security_id;	// 0x34: The Security Id is a key in the $SII Index and $SDS
+	__le64 quota_charge;	// 0x38:
+	__le64 usn;		// 0x40: Last Update Sequence Number of the file. This is a direct
+				// index into the file $UsnJrnl. If zero, the USN Journal is
+				// disabled.
+};
+
+static_assert(sizeof(struct ATTR_STD_INFO5) == 0x48);
+
+/* attribute list entry structure (0x20) */
+struct ATTR_LIST_ENTRY {
+	enum ATTR_TYPE type;	// 0x00: The type of attribute
+	__le16 size;		// 0x04: The size of this record
+	u8 name_len;		// 0x06: The length of attribute name
+	u8 name_off;		// 0x07: The offset to attribute name
+	__le64 vcn;		// 0x08: Starting VCN of this attribute
+	struct MFT_REF ref;	// 0x10: MFT record number with attribute
+	__le16 id;		// 0x18: struct ATTRIB ID
+	__le16 name[3];		// 0x1A: Just to align. To get real name can use bNameOffset
+
+}; // sizeof(0x20)
+
+static_assert(sizeof(struct ATTR_LIST_ENTRY) == 0x20);
+
+static inline u32 le_size(u8 name_len)
+{
+	return QuadAlign(offsetof(struct ATTR_LIST_ENTRY, name) +
+			 name_len * sizeof(short));
+}
+
+/* returns 0 if 'attr' has the same type and name */
+static inline int le_cmp(const struct ATTR_LIST_ENTRY *le,
+			 const struct ATTRIB *attr)
+{
+	return le->type != attr->type || le->name_len != attr->name_len ||
+	       (!le->name_len &&
+		memcmp(Add2Ptr(le, le->name_off),
+		       Add2Ptr(attr, le16_to_cpu(attr->name_off)),
+		       le->name_len * sizeof(short)));
+}
+
+static const inline __le16 *le_name(const struct ATTR_LIST_ENTRY *le)
+{
+	return Add2Ptr(le, le->name_off);
+}
+
+/* File name types (the field type in struct ATTR_FILE_NAME ) */
+#define FILE_NAME_POSIX   0
+#define FILE_NAME_UNICODE 1
+#define FILE_NAME_DOS	  2
+#define FILE_NAME_UNICODE_AND_DOS (FILE_NAME_DOS | FILE_NAME_UNICODE)
+
+/* Filename attribute structure (0x30) */
+struct NTFS_DUP_INFO {
+	__le64 cr_time;		// 0x00: File creation file
+	__le64 m_time;		// 0x08: File modification time
+	__le64 c_time;		// 0x10: Last time any attribute was modified
+	__le64 a_time;		// 0x18: File last access time
+	__le64 alloc_size;	// 0x20: Data attribute allocated size, multiple of cluster size
+	__le64 data_size;	// 0x28: Data attribute size <= Dataalloc_size
+	enum FILE_ATTRIBUTE fa;	// 0x30: Standard DOS attributes & more
+	__le16 ea_size;		// 0x34: Packed EAs
+	__le16 reparse;		// 0x36: Used by Reparse
+
+}; // 0x38
+
+struct ATTR_FILE_NAME {
+	struct MFT_REF home;	// 0x00: MFT record for directory
+	struct NTFS_DUP_INFO dup;// 0x08
+	u8 name_len;		// 0x40: File name length in words
+	u8 type;		// 0x41: File name type
+	__le16 name[1];		// 0x42: File name
+};
+
+static_assert(sizeof(((struct ATTR_FILE_NAME *)NULL)->dup) == 0x38);
+static_assert(offsetof(struct ATTR_FILE_NAME, name) == 0x42);
+#define SIZEOF_ATTRIBUTE_FILENAME     0x44
+#define SIZEOF_ATTRIBUTE_FILENAME_MAX (0x42 + 255 * 2)
+
+static inline struct ATTRIB *attr_from_name(struct ATTR_FILE_NAME *fname)
+{
+	return (struct ATTRIB *)((char *)fname - SIZEOF_RESIDENT);
+}
+
+static inline u16 fname_full_size(const struct ATTR_FILE_NAME *fname)
+{
+	return offsetof(struct ATTR_FILE_NAME, name) +
+	       fname->name_len * sizeof(short);
+}
+
+static inline u8 paired_name(u8 type)
+{
+	if (type == FILE_NAME_UNICODE)
+		return FILE_NAME_DOS;
+	if (type == FILE_NAME_DOS)
+		return FILE_NAME_UNICODE;
+	return FILE_NAME_POSIX;
+}
+
+/* Index entry defines ( the field flags in NtfsDirEntry ) */
+#define NTFS_IE_HAS_SUBNODES	cpu_to_le16(1)
+#define NTFS_IE_LAST		cpu_to_le16(2)
+
+/* Directory entry structure */
+struct NTFS_DE {
+	union {
+		struct MFT_REF ref; // 0x00: MFT record number with this file
+		struct {
+			__le16 data_off;  // 0x00:
+			__le16 data_size; // 0x02:
+			__le32 res;	  // 0x04: must be 0
+		} view;
+	};
+	__le16 size;		// 0x08: The size of this entry
+	__le16 key_size;	// 0x0A: The size of File name length in bytes + 0x42
+	__le16 flags;		// 0x0C: Entry flags: NTFS_IE_XXX
+	__le16 res;		// 0x0E:
+
+	// Here any indexed attribute can be placed
+	// One of them is:
+	// struct ATTR_FILE_NAME AttrFileName;
+	//
+
+	// The last 8 bytes of this structure contains
+	// the VBN of subnode
+	// !!! Note !!!
+	// This field is presented only if (flags & NTFS_IE_HAS_SUBNODES)
+	// __le64 vbn;
+};
+
+static_assert(sizeof(struct NTFS_DE) == 0x10);
+
+static inline void de_set_vbn_le(struct NTFS_DE *e, __le64 vcn)
+{
+	__le64 *v = Add2Ptr(e, le16_to_cpu(e->size) - sizeof(__le64));
+
+	*v = vcn;
+}
+
+static inline void de_set_vbn(struct NTFS_DE *e, CLST vcn)
+{
+	__le64 *v = Add2Ptr(e, le16_to_cpu(e->size) - sizeof(__le64));
+
+	*v = cpu_to_le64(vcn);
+}
+
+static inline __le64 de_get_vbn_le(const struct NTFS_DE *e)
+{
+	return *(__le64 *)Add2Ptr(e, le16_to_cpu(e->size) - sizeof(__le64));
+}
+
+static inline CLST de_get_vbn(const struct NTFS_DE *e)
+{
+	__le64 *v = Add2Ptr(e, le16_to_cpu(e->size) - sizeof(__le64));
+
+	return le64_to_cpu(*v);
+}
+
+static inline struct NTFS_DE *de_get_next(const struct NTFS_DE *e)
+{
+	return Add2Ptr(e, le16_to_cpu(e->size));
+}
+
+static inline struct ATTR_FILE_NAME *de_get_fname(const struct NTFS_DE *e)
+{
+	return le16_to_cpu(e->key_size) >= SIZEOF_ATTRIBUTE_FILENAME ?
+		       Add2Ptr(e, sizeof(struct NTFS_DE)) :
+		       NULL;
+}
+
+static inline bool de_is_last(const struct NTFS_DE *e)
+{
+	return e->flags & NTFS_IE_LAST;
+}
+
+static inline bool de_has_vcn(const struct NTFS_DE *e)
+{
+	return e->flags & NTFS_IE_HAS_SUBNODES;
+}
+
+static inline bool de_has_vcn_ex(const struct NTFS_DE *e)
+{
+	return (e->flags & NTFS_IE_HAS_SUBNODES) &&
+	       (u64)(-1) != *((u64 *)Add2Ptr(e, le16_to_cpu(e->size) -
+							sizeof(__le64)));
+}
+
+#define MAX_BYTES_PER_NAME_ENTRY					       \
+	QuadAlign(sizeof(struct NTFS_DE) +				       \
+		  offsetof(struct ATTR_FILE_NAME, name) +		       \
+		  NTFS_NAME_LEN * sizeof(short))
+
+struct INDEX_HDR {
+	__le32 de_off;	// 0x00: The offset from the start of this structure
+			// to the first NTFS_DE
+	__le32 used;	// 0x04: The size of this structure plus all
+			// entries (quad-word aligned)
+	__le32 total;	// 0x08: The allocated size of for this structure plus all entries
+	u8 flags;	// 0x0C: 0x00 = Small directory, 0x01 = Large directory
+	u8 res[3];
+
+	//
+	// de_off + used <= total
+	//
+};
+
+static_assert(sizeof(struct INDEX_HDR) == 0x10);
+
+static inline struct NTFS_DE *hdr_first_de(const struct INDEX_HDR *hdr)
+{
+	u32 de_off = le32_to_cpu(hdr->de_off);
+	u32 used = le32_to_cpu(hdr->used);
+	struct NTFS_DE *e = Add2Ptr(hdr, de_off);
+	u16 esize;
+
+	if (de_off >= used || de_off >= le32_to_cpu(hdr->total))
+		return NULL;
+
+	esize = le16_to_cpu(e->size);
+	if (esize < sizeof(struct NTFS_DE) || de_off + esize > used)
+		return NULL;
+
+	return e;
+}
+
+static inline struct NTFS_DE *hdr_next_de(const struct INDEX_HDR *hdr,
+					  const struct NTFS_DE *e)
+{
+	size_t off = PtrOffset(hdr, e);
+	u32 used = le32_to_cpu(hdr->used);
+	u16 esize;
+
+	if (off >= used)
+		return NULL;
+
+	esize = le16_to_cpu(e->size);
+
+	if (esize < sizeof(struct NTFS_DE) ||
+	    off + esize + sizeof(struct NTFS_DE) > used)
+		return NULL;
+
+	return Add2Ptr(e, esize);
+}
+
+static inline bool hdr_has_subnode(const struct INDEX_HDR *hdr)
+{
+	return hdr->flags & 1;
+}
+
+struct INDEX_BUFFER {
+	struct NTFS_RECORD_HEADER rhdr; // 'INDX'
+	__le64 vbn; // 0x10: vcn if index >= cluster or vsn id index < cluster
+	struct INDEX_HDR ihdr; // 0x18:
+};
+
+static_assert(sizeof(struct INDEX_BUFFER) == 0x28);
+
+static inline bool ib_is_empty(const struct INDEX_BUFFER *ib)
+{
+	const struct NTFS_DE *first = hdr_first_de(&ib->ihdr);
+
+	return !first || de_is_last(first);
+}
+
+static inline bool ib_is_leaf(const struct INDEX_BUFFER *ib)
+{
+	return !(ib->ihdr.flags & 1);
+}
+
+/* Index root structure ( 0x90 ) */
+enum COLLATION_RULE {
+	NTFS_COLLATION_TYPE_BINARY	= cpu_to_le32(0),
+	// $I30
+	NTFS_COLLATION_TYPE_FILENAME	= cpu_to_le32(0x01),
+	// $SII of $Secure and $Q of Quota
+	NTFS_COLLATION_TYPE_UINT	= cpu_to_le32(0x10),
+	// $O of Quota
+	NTFS_COLLATION_TYPE_SID		= cpu_to_le32(0x11),
+	// $SDH of $Secure
+	NTFS_COLLATION_TYPE_SECURITY_HASH = cpu_to_le32(0x12),
+	// $O of ObjId and "$R" for Reparse
+	NTFS_COLLATION_TYPE_UINTS	= cpu_to_le32(0x13)
+};
+
+static_assert(sizeof(enum COLLATION_RULE) == 4);
+
+//
+struct INDEX_ROOT {
+	enum ATTR_TYPE type;	// 0x00: The type of attribute to index on
+	enum COLLATION_RULE rule; // 0x04: The rule
+	__le32 index_block_size;// 0x08: The size of index record
+	u8 index_block_clst;	// 0x0C: The number of clusters per index
+	u8 res[3];
+	struct INDEX_HDR ihdr;	// 0x10:
+};
+
+static_assert(sizeof(struct INDEX_ROOT) == 0x20);
+static_assert(offsetof(struct INDEX_ROOT, ihdr) == 0x10);
+
+#define VOLUME_FLAG_DIRTY	    cpu_to_le16(0x0001)
+#define VOLUME_FLAG_RESIZE_LOG_FILE cpu_to_le16(0x0002)
+
+struct VOLUME_INFO {
+	__le64 res1;	// 0x00
+	u8 major_ver;	// 0x08: NTFS major version number (before .)
+	u8 minor_ver;	// 0x09: NTFS minor version number (after .)
+	__le16 flags;	// 0x0A: Volume flags, see VOLUME_FLAG_XXX
+
+}; // sizeof=0xC
+
+#define SIZEOF_ATTRIBUTE_VOLUME_INFO 0xc
+
+#define NTFS_LABEL_MAX_LENGTH		(0x100 / sizeof(short))
+#define NTFS_ATTR_INDEXABLE		cpu_to_le32(0x00000002)
+#define NTFS_ATTR_DUPALLOWED		cpu_to_le32(0x00000004)
+#define NTFS_ATTR_MUST_BE_INDEXED	cpu_to_le32(0x00000010)
+#define NTFS_ATTR_MUST_BE_NAMED		cpu_to_le32(0x00000020)
+#define NTFS_ATTR_MUST_BE_RESIDENT	cpu_to_le32(0x00000040)
+#define NTFS_ATTR_LOG_ALWAYS		cpu_to_le32(0x00000080)
+
+/* $AttrDef file entry */
+struct ATTR_DEF_ENTRY {
+	__le16 name[0x40];	// 0x00: Attr name
+	enum ATTR_TYPE type;	// 0x80: struct ATTRIB type
+	__le32 res;		// 0x84:
+	enum COLLATION_RULE rule; // 0x88:
+	__le32 flags;		// 0x8C: NTFS_ATTR_XXX (see above)
+	__le64 min_sz;		// 0x90: Minimum attribute data size
+	__le64 max_sz;		// 0x98: Maximum attribute data size
+};
+
+static_assert(sizeof(struct ATTR_DEF_ENTRY) == 0xa0);
+
+/* Object ID (0x40) */
+struct OBJECT_ID {
+	struct GUID ObjId;	// 0x00: Unique Id assigned to file
+	struct GUID BirthVolumeId;// 0x10: Birth Volume Id is the Object Id of the Volume on
+				// which the Object Id was allocated. It never changes
+	struct GUID BirthObjectId; // 0x20: Birth Object Id is the first Object Id that was
+				// ever assigned to this MFT Record. I.e. If the Object Id
+				// is changed for some reason, this field will reflect the
+				// original value of the Object Id.
+	struct GUID DomainId;	// 0x30: Domain Id is currently unused but it is intended to be
+				// used in a network environment where the local machine is
+				// part of a Windows 2000 Domain. This may be used in a Windows
+				// 2000 Advanced Server managed domain.
+};
+
+static_assert(sizeof(struct OBJECT_ID) == 0x40);
+
+/* O Directory entry structure ( rule = 0x13 ) */
+struct NTFS_DE_O {
+	struct NTFS_DE de;
+	struct GUID ObjId;	// 0x10: Unique Id assigned to file
+	struct MFT_REF ref;	// 0x20: MFT record number with this file
+	struct GUID BirthVolumeId; // 0x28: Birth Volume Id is the Object Id of the Volume on
+				// which the Object Id was allocated. It never changes
+	struct GUID BirthObjectId; // 0x38: Birth Object Id is the first Object Id that was
+				// ever assigned to this MFT Record. I.e. If the Object Id
+				// is changed for some reason, this field will reflect the
+				// original value of the Object Id.
+				// This field is valid if data_size == 0x48
+	struct GUID BirthDomainId; // 0x48: Domain Id is currently unused but it is intended
+				// to be used in a network environment where the local
+				// machine is part of a Windows 2000 Domain. This may be
+				// used in a Windows 2000 Advanced Server managed domain.
+};
+
+static_assert(sizeof(struct NTFS_DE_O) == 0x58);
+
+#define NTFS_OBJECT_ENTRY_DATA_SIZE1					       \
+	0x38 // struct NTFS_DE_O.BirthDomainId is not used
+#define NTFS_OBJECT_ENTRY_DATA_SIZE2					       \
+	0x48 // struct NTFS_DE_O.BirthDomainId is used
+
+/* Q Directory entry structure ( rule = 0x11 ) */
+struct NTFS_DE_Q {
+	struct NTFS_DE de;
+	__le32 owner_id;	// 0x10: Unique Id assigned to file
+	__le32 Version;		// 0x14: 0x02
+	__le32 flags2;		// 0x18: Quota flags, see above
+	__le64 BytesUsed;	// 0x1C:
+	__le64 ChangeTime;	// 0x24:
+	__le64 WarningLimit;	// 0x28:
+	__le64 HardLimit;	// 0x34:
+	__le64 ExceededTime;	// 0x3C:
+
+	// SID is placed here
+}; // sizeof() = 0x44
+
+#define SIZEOF_NTFS_DE_Q 0x44
+
+#define SecurityDescriptorsBlockSize 0x40000 // 256K
+#define SecurityDescriptorMaxSize    0x20000 // 128K
+#define Log2OfSecurityDescriptorsBlockSize 18
+
+struct SECURITY_KEY {
+	__le32 hash; //  Hash value for descriptor
+	__le32 sec_id; //  Security Id (guaranteed unique)
+};
+
+/* Security descriptors (the content of $Secure::SDS data stream) */
+struct SECURITY_HDR {
+	struct SECURITY_KEY key;	// 0x00: Security Key
+	__le64 off;			// 0x08: Offset of this entry in the file
+	__le32 size;			// 0x10: Size of this entry, 8 byte aligned
+	//
+	// Security descriptor itself is placed here
+	// Total size is 16 byte aligned
+	//
+} __packed;
+
+#define SIZEOF_SECURITY_HDR 0x14
+
+/* SII Directory entry structure */
+struct NTFS_DE_SII {
+	struct NTFS_DE de;
+	__le32 sec_id;			// 0x10: Key: sizeof(security_id) = wKeySize
+	struct SECURITY_HDR sec_hdr;	// 0x14:
+} __packed;
+
+#define SIZEOF_SII_DIRENTRY 0x28
+
+/* SDH Directory entry structure */
+struct NTFS_DE_SDH {
+	struct NTFS_DE de;
+	struct SECURITY_KEY key;	// 0x10: Key
+	struct SECURITY_HDR sec_hdr;	// 0x18: Data
+	__le16 magic[2];		// 0x2C: 0x00490049 "I I"
+};
+
+#define SIZEOF_SDH_DIRENTRY 0x30
+
+struct REPARSE_KEY {
+	__le32 ReparseTag;		// 0x00: Reparse Tag
+	struct MFT_REF ref;		// 0x04: MFT record number with this file
+}; // sizeof() = 0x0C
+
+static_assert(offsetof(struct REPARSE_KEY, ref) == 0x04);
+#define SIZEOF_REPARSE_KEY 0x0C
+
+/* Reparse Directory entry structure */
+struct NTFS_DE_R {
+	struct NTFS_DE de;
+	struct REPARSE_KEY key;		// 0x10: Reparse Key
+	u32 zero;			// 0x1c
+}; // sizeof() = 0x20
+
+static_assert(sizeof(struct NTFS_DE_R) == 0x20);
+
+/* CompressReparseBuffer.WofVersion */
+#define WOF_CURRENT_VERSION		cpu_to_le32(1)
+/* CompressReparseBuffer.WofProvider */
+#define WOF_PROVIDER_WIM		cpu_to_le32(1)
+/* CompressReparseBuffer.WofProvider */
+#define WOF_PROVIDER_SYSTEM		cpu_to_le32(2)
+/* CompressReparseBuffer.ProviderVer */
+#define WOF_PROVIDER_CURRENT_VERSION	cpu_to_le32(1)
+
+#define WOF_COMPRESSION_XPRESS4K	cpu_to_le32(0) // 4k
+#define WOF_COMPRESSION_LZX32K		cpu_to_le32(1) // 32k
+#define WOF_COMPRESSION_XPRESS8K	cpu_to_le32(2) // 8k
+#define WOF_COMPRESSION_XPRESS16K	cpu_to_le32(3) // 16k
+
+/*
+ * ATTR_REPARSE (0xC0)
+ *
+ * The reparse struct GUID structure is used by all 3rd party layered drivers to
+ * store data in a reparse point. For non-Microsoft tags, The struct GUID field
+ * cannot be GUID_NULL.
+ * The constraints on reparse tags are defined below.
+ * Microsoft tags can also be used with this format of the reparse point buffer.
+ */
+struct REPARSE_POINT {
+	__le32 ReparseTag;	// 0x00:
+	__le16 ReparseDataLength;// 0x04:
+	__le16 Reserved;
+
+	struct GUID Guid;	// 0x08:
+
+	//
+	// Here GenericReparseBuffer is placed
+	//
+};
+
+static_assert(sizeof(struct REPARSE_POINT) == 0x18);
+
+//
+// Maximum allowed size of the reparse data.
+//
+#define MAXIMUM_REPARSE_DATA_BUFFER_SIZE	(16 * 1024)
+
+//
+// The value of the following constant needs to satisfy the following
+// conditions:
+//  (1) Be at least as large as the largest of the reserved tags.
+//  (2) Be strictly smaller than all the tags in use.
+//
+#define IO_REPARSE_TAG_RESERVED_RANGE		1
+
+//
+// The reparse tags are a ULONG. The 32 bits are laid out as follows:
+//
+//   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
+//   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+//  +-+-+-+-+-----------------------+-------------------------------+
+//  |M|R|N|R|	  Reserved bits     |	    Reparse Tag Value	    |
+//  +-+-+-+-+-----------------------+-------------------------------+
+//
+// M is the Microsoft bit. When set to 1, it denotes a tag owned by Microsoft.
+//   All ISVs must use a tag with a 0 in this position.
+//   Note: If a Microsoft tag is used by non-Microsoft software, the
+//   behavior is not defined.
+//
+// R is reserved.  Must be zero for non-Microsoft tags.
+//
+// N is name surrogate. When set to 1, the file represents another named
+//   entity in the system.
+//
+// The M and N bits are OR-able.
+// The following macros check for the M and N bit values:
+//
+
+//
+// Macro to determine whether a reparse point tag corresponds to a tag
+// owned by Microsoft.
+//
+#define IsReparseTagMicrosoft(_tag)	(((_tag)&IO_REPARSE_TAG_MICROSOFT))
+
+//
+// Macro to determine whether a reparse point tag is a name surrogate
+//
+#define IsReparseTagNameSurrogate(_tag)	(((_tag)&IO_REPARSE_TAG_NAME_SURROGATE))
+
+//
+// The following constant represents the bits that are valid to use in
+// reparse tags.
+//
+#define IO_REPARSE_TAG_VALID_VALUES	0xF000FFFF
+
+//
+// Macro to determine whether a reparse tag is a valid tag.
+//
+#define IsReparseTagValid(_tag)						       \
+	(!((_tag) & ~IO_REPARSE_TAG_VALID_VALUES) &&			       \
+	 ((_tag) > IO_REPARSE_TAG_RESERVED_RANGE))
+
+//
+// Microsoft tags for reparse points.
+//
+
+enum IO_REPARSE_TAG {
+	IO_REPARSE_TAG_SYMBOLIC_LINK	= cpu_to_le32(0),
+	IO_REPARSE_TAG_NAME_SURROGATE	= cpu_to_le32(0x20000000),
+	IO_REPARSE_TAG_MICROSOFT	= cpu_to_le32(0x80000000),
+	IO_REPARSE_TAG_MOUNT_POINT	= cpu_to_le32(0xA0000003),
+	IO_REPARSE_TAG_SYMLINK		= cpu_to_le32(0xA000000C),
+	IO_REPARSE_TAG_HSM		= cpu_to_le32(0xC0000004),
+	IO_REPARSE_TAG_SIS		= cpu_to_le32(0x80000007),
+	IO_REPARSE_TAG_DEDUP		= cpu_to_le32(0x80000013),
+	IO_REPARSE_TAG_COMPRESS		= cpu_to_le32(0x80000017),
+
+	//
+	// The reparse tag 0x80000008 is reserved for Microsoft internal use
+	// (may be published in the future)
+	//
+
+	//
+	// Microsoft reparse tag reserved for DFS
+	//
+	IO_REPARSE_TAG_DFS		= cpu_to_le32(0x8000000A),
+
+	//
+	// Microsoft reparse tag reserved for the file system filter manager
+	//
+	IO_REPARSE_TAG_FILTER_MANAGER	= cpu_to_le32(0x8000000B),
+
+	//
+	// Non-Microsoft tags for reparse points
+	//
+
+	//
+	// Tag allocated to CONGRUENT, May 2000. Used by IFSTEST
+	//
+	IO_REPARSE_TAG_IFSTEST_CONGRUENT = cpu_to_le32(0x00000009),
+
+	//
+	// Tag allocated to ARKIVIO
+	//
+	IO_REPARSE_TAG_ARKIVIO		= cpu_to_le32(0x0000000C),
+
+	//
+	//  Tag allocated to SOLUTIONSOFT
+	//
+	IO_REPARSE_TAG_SOLUTIONSOFT	= cpu_to_le32(0x2000000D),
+
+	//
+	//  Tag allocated to COMMVAULT
+	//
+	IO_REPARSE_TAG_COMMVAULT	= cpu_to_le32(0x0000000E),
+
+	// OneDrive??
+	IO_REPARSE_TAG_CLOUD		= cpu_to_le32(0x9000001A),
+	IO_REPARSE_TAG_CLOUD_1		= cpu_to_le32(0x9000101A),
+	IO_REPARSE_TAG_CLOUD_2		= cpu_to_le32(0x9000201A),
+	IO_REPARSE_TAG_CLOUD_3		= cpu_to_le32(0x9000301A),
+	IO_REPARSE_TAG_CLOUD_4		= cpu_to_le32(0x9000401A),
+	IO_REPARSE_TAG_CLOUD_5		= cpu_to_le32(0x9000501A),
+	IO_REPARSE_TAG_CLOUD_6		= cpu_to_le32(0x9000601A),
+	IO_REPARSE_TAG_CLOUD_7		= cpu_to_le32(0x9000701A),
+	IO_REPARSE_TAG_CLOUD_8		= cpu_to_le32(0x9000801A),
+	IO_REPARSE_TAG_CLOUD_9		= cpu_to_le32(0x9000901A),
+	IO_REPARSE_TAG_CLOUD_A		= cpu_to_le32(0x9000A01A),
+	IO_REPARSE_TAG_CLOUD_B		= cpu_to_le32(0x9000B01A),
+	IO_REPARSE_TAG_CLOUD_C		= cpu_to_le32(0x9000C01A),
+	IO_REPARSE_TAG_CLOUD_D		= cpu_to_le32(0x9000D01A),
+	IO_REPARSE_TAG_CLOUD_E		= cpu_to_le32(0x9000E01A),
+	IO_REPARSE_TAG_CLOUD_F		= cpu_to_le32(0x9000F01A),
+
+};
+
+#define SYMLINK_FLAG_RELATIVE		1
+
+/* Microsoft reparse buffer. (see DDK for details) */
+struct REPARSE_DATA_BUFFER {
+	__le32 ReparseTag;		// 0x00:
+	__le16 ReparseDataLength;	// 0x04:
+	__le16 Reserved;
+
+	union {
+		// If ReparseTag == 0xA0000003 (IO_REPARSE_TAG_MOUNT_POINT)
+		struct {
+			__le16 SubstituteNameOffset; // 0x08
+			__le16 SubstituteNameLength; // 0x0A
+			__le16 PrintNameOffset;      // 0x0C
+			__le16 PrintNameLength;      // 0x0E
+			__le16 PathBuffer[1];	     // 0x10
+		} MountPointReparseBuffer;
+
+		// If ReparseTag == 0xA000000C (IO_REPARSE_TAG_SYMLINK)
+		// https://msdn.microsoft.com/en-us/library/cc232006.aspx
+		struct {
+			__le16 SubstituteNameOffset; // 0x08
+			__le16 SubstituteNameLength; // 0x0A
+			__le16 PrintNameOffset;      // 0x0C
+			__le16 PrintNameLength;      // 0x0E
+			// 0-absolute path 1- relative path, SYMLINK_FLAG_RELATIVE
+			__le32 Flags;		     // 0x10
+			__le16 PathBuffer[1];	     // 0x14
+		} SymbolicLinkReparseBuffer;
+
+		// If ReparseTag == 0x80000017U
+		struct {
+			__le32 WofVersion;  // 0x08 == 1
+			/* 1 - WIM backing provider ("WIMBoot"),
+			 * 2 - System compressed file provider
+			 */
+			__le32 WofProvider; // 0x0C
+			__le32 ProviderVer; // 0x10: == 1 WOF_FILE_PROVIDER_CURRENT_VERSION == 1
+			__le32 CompressionFormat; // 0x14: 0, 1, 2, 3. See WOF_COMPRESSION_XXX
+		} CompressReparseBuffer;
+
+		struct {
+			u8 DataBuffer[1];   // 0x08
+		} GenericReparseBuffer;
+	};
+};
+
+/* ATTR_EA_INFO (0xD0) */
+
+#define FILE_NEED_EA 0x80 // See ntifs.h
+/* FILE_NEED_EA, indicates that the file to which the EA belongs cannot be
+ * interpreted without understanding the associated extended attributes.
+ */
+struct EA_INFO {
+	__le16 size_pack;	// 0x00: Size of buffer to hold in packed form
+	__le16 count;		// 0x02: Count of EA's with FILE_NEED_EA bit set
+	__le32 size;		// 0x04: Size of buffer to hold in unpacked form
+};
+
+static_assert(sizeof(struct EA_INFO) == 8);
+
+/* ATTR_EA (0xE0) */
+struct EA_FULL {
+	__le32 size;		// 0x00: (not in packed)
+	u8 flags;		// 0x04
+	u8 name_len;		// 0x05
+	__le16 elength;		// 0x06
+	u8 name[1];		// 0x08
+};
+
+static_assert(offsetof(struct EA_FULL, name) == 8);
+
+#define MAX_EA_DATA_SIZE (256 * 1024)
+
+#define ACL_REVISION 2
+
+#define SE_SELF_RELATIVE cpu_to_le16(0x8000)
+
+struct SECURITY_DESCRIPTOR_RELATIVE {
+	u8 Revision;
+	u8 Sbz1;
+	__le16 Control;
+	__le32 Owner;
+	__le32 Group;
+	__le32 Sacl;
+	__le32 Dacl;
+};
+static_assert(sizeof(struct SECURITY_DESCRIPTOR_RELATIVE) == 0x14);
+
+struct ACE_HEADER {
+	u8 AceType;
+	u8 AceFlags;
+	__le16 AceSize;
+};
+static_assert(sizeof(struct ACE_HEADER) == 4);
+
+struct ACL {
+	u8 AclRevision;
+	u8 Sbz1;
+	__le16 AclSize;
+	__le16 AceCount;
+	__le16 Sbz2;
+};
+static_assert(sizeof(struct ACL) == 8);
+
+struct SID {
+	u8 Revision;
+	u8 SubAuthorityCount;
+	u8 IdentifierAuthority[6];
+	__le32 SubAuthority[1];
+};
+static_assert(offsetof(struct SID, SubAuthority) == 8);
+
+// clang-format on
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
new file mode 100644
index 000000000000..09e183f95bd1
--- /dev/null
+++ b/fs/ntfs3/ntfs_fs.h
@@ -0,0 +1,1065 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+// clang-format off
+#define MINUS_ONE_T			((size_t)(-1))
+/* Biggest MFT / smallest cluster */
+#define MAXIMUM_BYTES_PER_MFT		4096
+#define NTFS_BLOCKS_PER_MFT_RECORD	(MAXIMUM_BYTES_PER_MFT / 512)
+
+#define MAXIMUM_BYTES_PER_INDEX		4096
+#define NTFS_BLOCKS_PER_INODE		(MAXIMUM_BYTES_PER_INDEX / 512)
+
+/* ntfs specific error code when fixup failed*/
+#define E_NTFS_FIXUP			555
+/* ntfs specific error code about resident->nonresident*/
+#define E_NTFS_NONRESIDENT		556
+
+/* sbi->flags */
+#define NTFS_FLAGS_NODISCARD		0x00000001
+#define NTFS_FLAGS_NEED_REPLAY		0x04000000
+
+/* ni->ni_flags */
+/*
+ * Data attribute is external compressed (lzx/xpress)
+ * 1 - WOF_COMPRESSION_XPRESS4K
+ * 2 - WOF_COMPRESSION_XPRESS8K
+ * 3 - WOF_COMPRESSION_XPRESS16K
+ * 4 - WOF_COMPRESSION_LZX32K
+ */
+#define NI_FLAG_COMPRESSED_MASK		0x0000000f
+/* Data attribute is deduplicated */
+#define NI_FLAG_DEDUPLICATED		0x00000010
+#define NI_FLAG_EA			0x00000020
+#define NI_FLAG_DIR			0x00000040
+#define NI_FLAG_RESIDENT		0x00000080
+#define NI_FLAG_UPDATE_PARENT		0x00000100
+// clang-format on
+
+struct ntfs_mount_options {
+	struct nls_table *nls;
+
+	kuid_t fs_uid;
+	kgid_t fs_gid;
+	u16 fs_fmask_inv;
+	u16 fs_dmask_inv;
+
+	unsigned uid : 1, /* uid was set */
+		gid : 1, /* gid was set */
+		fmask : 1, /* fmask was set */
+		dmask : 1, /*dmask was set*/
+		sys_immutable : 1, /* immutable system files */
+		discard : 1, /* issue discard requests on deletions */
+		sparse : 1, /*create sparse files*/
+		showmeta : 1, /*show meta files*/
+		nohidden : 1, /*do not show hidden files*/
+		force : 1, /*rw mount dirty volume*/
+		no_acs_rules : 1, /*exclude acs rules*/
+		prealloc : 1 /*preallocate space when file is growing*/
+		;
+};
+
+/* special value to unpack and deallocate*/
+#define RUN_DEALLOCATE ((struct runs_tree *)(size_t)1)
+
+/* TODO: use rb tree instead of array */
+struct runs_tree {
+	struct ntfs_run *runs_;
+	size_t count; // Currently used size a ntfs_run storage.
+	size_t allocated; // Currently allocated ntfs_run storage size.
+};
+
+struct ntfs_buffers {
+	/* Biggest MFT / smallest cluster = 4096 / 512 = 8 */
+	/* Biggest index / smallest cluster = 4096 / 512 = 8 */
+	struct buffer_head *bh[PAGE_SIZE >> SECTOR_SHIFT];
+	u32 bytes;
+	u32 nbufs;
+	u32 off;
+};
+
+enum ALLOCATE_OPT {
+	ALLOCATE_DEF = 0, // Allocate all clusters
+	ALLOCATE_MFT = 1, // Allocate for MFT
+};
+
+enum bitmap_mutex_classes {
+	BITMAP_MUTEX_CLUSTERS = 0,
+	BITMAP_MUTEX_MFT = 1,
+};
+
+struct wnd_bitmap {
+	struct super_block *sb;
+	struct rw_semaphore rw_lock;
+
+	struct runs_tree run;
+	size_t nbits;
+
+	u16 free_holder[8]; // holder for free_bits
+
+	size_t total_zeroes; // total number of free bits
+	u16 *free_bits; // free bits in each window
+	size_t nwnd;
+	u32 bits_last; // bits in last window
+
+	struct rb_root start_tree; // extents, sorted by 'start'
+	struct rb_root count_tree; // extents, sorted by 'count + start'
+	size_t count; // extents count
+
+	/*
+	 * -1 Tree is activated but not updated (too many fragments)
+	 * 0 - Tree is not activated
+	 * 1 - Tree is activated and updated
+	 */
+	int uptodated;
+	size_t extent_min; // Minimal extent used while building
+	size_t extent_max; // Upper estimate of biggest free block
+
+	/* Zone [bit, end) */
+	size_t zone_bit;
+	size_t zone_end;
+
+	bool set_tail; // not necessary in driver
+	bool inited;
+};
+
+typedef int (*NTFS_CMP_FUNC)(const void *key1, size_t len1, const void *key2,
+			     size_t len2, const void *param);
+
+enum index_mutex_classed {
+	INDEX_MUTEX_I30 = 0,
+	INDEX_MUTEX_SII = 1,
+	INDEX_MUTEX_SDH = 2,
+	INDEX_MUTEX_SO = 3,
+	INDEX_MUTEX_SQ = 4,
+	INDEX_MUTEX_SR = 5,
+	INDEX_MUTEX_TOTAL
+};
+
+/* This struct works with indexes */
+struct ntfs_index {
+	struct runs_tree bitmap_run;
+	struct runs_tree alloc_run;
+	/* read/write access to 'bitmap_run'/'alloc_run' while ntfs_readdir */
+	struct rw_semaphore run_lock;
+
+	/*TODO: remove 'cmp'*/
+	NTFS_CMP_FUNC cmp;
+
+	u8 index_bits; // log2(root->index_block_size)
+	u8 idx2vbn_bits; // log2(root->index_block_clst)
+	u8 vbn2vbo_bits; // index_block_size < cluster? 9 : cluster_bits
+	u8 type; // index_mutex_classed
+};
+
+/* Set when $LogFile is replaying */
+#define NTFS_FLAGS_LOG_REPLAYING 0x00000008
+
+/* Set when we changed first MFT's which copy must be updated in $MftMirr */
+#define NTFS_FLAGS_MFTMIRR 0x00001000
+
+/* Minimum mft zone */
+#define NTFS_MIN_MFT_ZONE 100
+
+/* ntfs file system in-core superblock data */
+struct ntfs_sb_info {
+	struct super_block *sb;
+
+	u32 discard_granularity;
+	u64 discard_granularity_mask_inv; // ~(discard_granularity_mask_inv-1)
+
+	u32 cluster_size; // bytes per cluster
+	u32 cluster_mask; // == cluster_size - 1
+	u64 cluster_mask_inv; // ~(cluster_size - 1)
+	u32 block_mask; // sb->s_blocksize - 1
+	u32 blocks_per_cluster; // cluster_size / sb->s_blocksize
+
+	u32 record_size;
+	u32 sector_size;
+	u32 index_size;
+
+	u8 sector_bits;
+	u8 cluster_bits;
+	u8 record_bits;
+
+	u64 maxbytes; // Maximum size for normal files
+	u64 maxbytes_sparse; // Maximum size for sparse file
+
+	u32 flags; // See NTFS_FLAGS_XXX
+
+	CLST bad_clusters; // The count of marked bad clusters
+
+	u16 max_bytes_per_attr; // maximum attribute size in record
+	u16 attr_size_tr; // attribute size threshold (320 bytes)
+
+	/* Records in $Extend */
+	CLST objid_no;
+	CLST quota_no;
+	CLST reparse_no;
+	CLST usn_jrnl_no;
+
+	struct ATTR_DEF_ENTRY *def_table; // attribute definition table
+	u32 def_entries;
+
+	struct MFT_REC *new_rec;
+
+	u16 *upcase;
+
+	struct {
+		u64 lbo, lbo2;
+		struct ntfs_inode *ni;
+		struct wnd_bitmap bitmap; // $MFT::Bitmap
+		ulong reserved_bitmap;
+		size_t next_free; // The next record to allocate from
+		size_t used;
+		u32 recs_mirr; // Number of records MFTMirr
+		u8 next_reserved;
+		u8 reserved_bitmap_inited;
+	} mft;
+
+	struct {
+		struct wnd_bitmap bitmap; // $Bitmap::Data
+		CLST next_free_lcn;
+	} used;
+
+	struct {
+		u64 size; // in bytes
+		u64 blocks; // in blocks
+		u64 ser_num;
+		struct ntfs_inode *ni;
+		__le16 flags; // see VOLUME_FLAG_XXX
+		u8 major_ver;
+		u8 minor_ver;
+		char label[65];
+		bool real_dirty; /* real fs state*/
+	} volume;
+
+	struct {
+		struct ntfs_index index_sii;
+		struct ntfs_index index_sdh;
+		struct ntfs_inode *ni;
+		u32 next_id;
+		u64 next_off;
+
+		__le32 def_security_id;
+	} security;
+
+	struct {
+		struct ntfs_index index_r;
+		struct ntfs_inode *ni;
+		u64 max_size; // 16K
+	} reparse;
+
+	struct {
+		struct ntfs_index index_o;
+		struct ntfs_inode *ni;
+	} objid;
+
+	struct {
+		struct mutex mtx_lznt;
+		struct lznt *lznt;
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+		struct mutex mtx_xpress;
+		struct xpress_decompressor *xpress;
+		struct mutex mtx_lzx;
+		struct lzx_decompressor *lzx;
+#endif
+	} compress;
+
+	struct ntfs_mount_options options;
+	struct ratelimit_state msg_ratelimit;
+};
+
+/*
+ * one MFT record(usually 1024 bytes), consists of attributes
+ */
+struct mft_inode {
+	struct rb_node node;
+	struct ntfs_sb_info *sbi;
+
+	struct MFT_REC *mrec;
+	struct ntfs_buffers nb;
+
+	CLST rno;
+	bool dirty;
+};
+
+/* nested class for ntfs_inode::ni_lock */
+enum ntfs_inode_mutex_lock_class {
+	NTFS_INODE_MUTEX_DIRTY,
+	NTFS_INODE_MUTEX_SECURITY,
+	NTFS_INODE_MUTEX_OBJID,
+	NTFS_INODE_MUTEX_REPARSE,
+	NTFS_INODE_MUTEX_NORMAL,
+	NTFS_INODE_MUTEX_PARENT,
+};
+
+/*
+ * ntfs inode - extends linux inode. consists of one or more mft inodes
+ */
+struct ntfs_inode {
+	struct mft_inode mi; // base record
+
+	/*
+	 * Valid size: [0 - i_valid) - these range in file contains valid data
+	 * Range [i_valid - inode->i_size) - contains 0
+	 * Usually i_valid <= inode->i_size
+	 */
+	u64 i_valid;
+	struct timespec64 i_crtime;
+
+	struct mutex ni_lock;
+
+	/* file attributes from std */
+	enum FILE_ATTRIBUTE std_fa;
+	__le32 std_security_id;
+
+	/*
+	 * tree of mft_inode
+	 * not empty when primary MFT record (usually 1024 bytes) can't save all attributes
+	 * e.g. file becomes too fragmented or contains a lot of names
+	 */
+	struct rb_root mi_tree;
+
+	/*
+	 * This member is used in ntfs_readdir to ensure that all subrecords are loaded
+	 */
+	u8 mi_loaded;
+
+	union {
+		struct ntfs_index dir;
+		struct {
+			struct rw_semaphore run_lock;
+			struct runs_tree run;
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+			struct page *offs_page;
+#endif
+		} file;
+	};
+
+	struct {
+		struct runs_tree run;
+		struct ATTR_LIST_ENTRY *le; // 1K aligned memory
+		size_t size;
+		bool dirty;
+	} attr_list;
+
+	size_t ni_flags; // NI_FLAG_XXX
+
+	struct inode vfs_inode;
+};
+
+struct indx_node {
+	struct ntfs_buffers nb;
+	struct INDEX_BUFFER *index;
+};
+
+struct ntfs_fnd {
+	int level;
+	struct indx_node *nodes[20];
+	struct NTFS_DE *de[20];
+	struct NTFS_DE *root_de;
+};
+
+enum REPARSE_SIGN {
+	REPARSE_NONE = 0,
+	REPARSE_COMPRESSED = 1,
+	REPARSE_DEDUPLICATED = 2,
+	REPARSE_LINK = 3
+};
+
+/* functions from attrib.c*/
+int attr_load_runs(struct ATTRIB *attr, struct ntfs_inode *ni,
+		   struct runs_tree *run, const CLST *vcn);
+int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run,
+			   CLST vcn, CLST lcn, CLST len, CLST *pre_alloc,
+			   enum ALLOCATE_OPT opt, CLST *alen, const size_t fr,
+			   CLST *new_lcn);
+int attr_make_nonresident(struct ntfs_inode *ni, struct ATTRIB *attr,
+			  struct ATTR_LIST_ENTRY *le, struct mft_inode *mi,
+			  u64 new_size, struct runs_tree *run,
+			  struct ATTRIB **ins_attr, struct page *page);
+int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type,
+		  const __le16 *name, u8 name_len, struct runs_tree *run,
+		  u64 new_size, const u64 *new_valid, bool keep_prealloc,
+		  struct ATTRIB **ret);
+int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn,
+			CLST *len, bool *new);
+int attr_data_read_resident(struct ntfs_inode *ni, struct page *page);
+int attr_data_write_resident(struct ntfs_inode *ni, struct page *page);
+int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type,
+		       const __le16 *name, u8 name_len, struct runs_tree *run,
+		       CLST vcn);
+int attr_load_runs_range(struct ntfs_inode *ni, enum ATTR_TYPE type,
+			 const __le16 *name, u8 name_len, struct runs_tree *run,
+			 u64 from, u64 to);
+int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr,
+			struct runs_tree *run, u64 frame, u64 frames,
+			u8 frame_bits, u32 *ondisk_size, u64 *vbo_data);
+int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr,
+			     CLST frame, CLST *clst_data);
+int attr_allocate_frame(struct ntfs_inode *ni, CLST frame, size_t compr_size,
+			u64 new_valid);
+int attr_collapse_range(struct ntfs_inode *ni, u64 vbo, u64 bytes);
+int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u64 bytes);
+
+/* functions from attrlist.c*/
+void al_destroy(struct ntfs_inode *ni);
+bool al_verify(struct ntfs_inode *ni);
+int ntfs_load_attr_list(struct ntfs_inode *ni, struct ATTRIB *attr);
+struct ATTR_LIST_ENTRY *al_enumerate(struct ntfs_inode *ni,
+				     struct ATTR_LIST_ENTRY *le);
+struct ATTR_LIST_ENTRY *al_find_le(struct ntfs_inode *ni,
+				   struct ATTR_LIST_ENTRY *le,
+				   const struct ATTRIB *attr);
+struct ATTR_LIST_ENTRY *al_find_ex(struct ntfs_inode *ni,
+				   struct ATTR_LIST_ENTRY *le,
+				   enum ATTR_TYPE type, const __le16 *name,
+				   u8 name_len, const CLST *vcn);
+int al_add_le(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name,
+	      u8 name_len, CLST svcn, __le16 id, const struct MFT_REF *ref,
+	      struct ATTR_LIST_ENTRY **new_le);
+bool al_remove_le(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le);
+bool al_delete_le(struct ntfs_inode *ni, enum ATTR_TYPE type, CLST vcn,
+		  const __le16 *name, size_t name_len,
+		  const struct MFT_REF *ref);
+int al_update(struct ntfs_inode *ni);
+static inline size_t al_aligned(size_t size)
+{
+	return (size + 1023) & ~(size_t)1023;
+}
+
+/* globals from bitfunc.c */
+bool are_bits_clear(const ulong *map, size_t bit, size_t nbits);
+bool are_bits_set(const ulong *map, size_t bit, size_t nbits);
+size_t get_set_bits_ex(const ulong *map, size_t bit, size_t nbits);
+
+/* globals from dir.c */
+int ntfs_utf16_to_nls(struct ntfs_sb_info *sbi, const struct le_str *uni,
+		      u8 *buf, int buf_len);
+int ntfs_nls_to_utf16(struct ntfs_sb_info *sbi, const u8 *name, u32 name_len,
+		      struct cpu_str *uni, u32 max_ulen,
+		      enum utf16_endian endian);
+struct inode *dir_search_u(struct inode *dir, const struct cpu_str *uni,
+			   struct ntfs_fnd *fnd);
+bool dir_is_empty(struct inode *dir);
+extern const struct file_operations ntfs_dir_operations;
+
+/* globals from file.c*/
+int ntfs_getattr(const struct path *path, struct kstat *stat, u32 request_mask,
+		 u32 flags);
+void ntfs_sparse_cluster(struct inode *inode, struct page *page0, CLST vcn,
+			 CLST len);
+int ntfs3_setattr(struct dentry *dentry, struct iattr *attr);
+int ntfs_file_open(struct inode *inode, struct file *file);
+int ntfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+		__u64 start, __u64 len);
+extern const struct inode_operations ntfs_special_inode_operations;
+extern const struct inode_operations ntfs_file_inode_operations;
+extern const struct file_operations ntfs_file_operations;
+
+/* globals from frecord.c */
+void ni_remove_mi(struct ntfs_inode *ni, struct mft_inode *mi);
+struct ATTR_STD_INFO *ni_std(struct ntfs_inode *ni);
+struct ATTR_STD_INFO5 *ni_std5(struct ntfs_inode *ni);
+void ni_clear(struct ntfs_inode *ni);
+int ni_load_mi_ex(struct ntfs_inode *ni, CLST rno, struct mft_inode **mi);
+int ni_load_mi(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le,
+	       struct mft_inode **mi);
+struct ATTRIB *ni_find_attr(struct ntfs_inode *ni, struct ATTRIB *attr,
+			    struct ATTR_LIST_ENTRY **entry_o,
+			    enum ATTR_TYPE type, const __le16 *name,
+			    u8 name_len, const CLST *vcn,
+			    struct mft_inode **mi);
+struct ATTRIB *ni_enum_attr_ex(struct ntfs_inode *ni, struct ATTRIB *attr,
+			       struct ATTR_LIST_ENTRY **le,
+			       struct mft_inode **mi);
+struct ATTRIB *ni_load_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
+			    const __le16 *name, u8 name_len, CLST vcn,
+			    struct mft_inode **pmi);
+int ni_load_all_mi(struct ntfs_inode *ni);
+bool ni_add_subrecord(struct ntfs_inode *ni, CLST rno, struct mft_inode **mi);
+int ni_remove_attr(struct ntfs_inode *ni, enum ATTR_TYPE type,
+		   const __le16 *name, size_t name_len, bool base_only,
+		   const __le16 *id);
+int ni_create_attr_list(struct ntfs_inode *ni);
+int ni_expand_list(struct ntfs_inode *ni);
+int ni_insert_nonresident(struct ntfs_inode *ni, enum ATTR_TYPE type,
+			  const __le16 *name, u8 name_len,
+			  const struct runs_tree *run, CLST svcn, CLST len,
+			  __le16 flags, struct ATTRIB **new_attr,
+			  struct mft_inode **mi);
+int ni_insert_resident(struct ntfs_inode *ni, u32 data_size,
+		       enum ATTR_TYPE type, const __le16 *name, u8 name_len,
+		       struct ATTRIB **new_attr, struct mft_inode **mi);
+int ni_remove_attr_le(struct ntfs_inode *ni, struct ATTRIB *attr,
+		      struct ATTR_LIST_ENTRY *le);
+int ni_delete_all(struct ntfs_inode *ni);
+struct ATTR_FILE_NAME *ni_fname_name(struct ntfs_inode *ni,
+				     const struct cpu_str *uni,
+				     const struct MFT_REF *home,
+				     struct ATTR_LIST_ENTRY **entry);
+struct ATTR_FILE_NAME *ni_fname_type(struct ntfs_inode *ni, u8 name_type,
+				     struct ATTR_LIST_ENTRY **entry);
+int ni_new_attr_flags(struct ntfs_inode *ni, enum FILE_ATTRIBUTE new_fa);
+enum REPARSE_SIGN ni_parse_reparse(struct ntfs_inode *ni, struct ATTRIB *attr,
+				   void *buffer);
+int ni_write_inode(struct inode *inode, int sync, const char *hint);
+#define _ni_write_inode(i, w) ni_write_inode(i, w, __func__)
+int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
+	      __u64 vbo, __u64 len);
+int ni_readpage_cmpr(struct ntfs_inode *ni, struct page *page);
+int ni_decompress_file(struct ntfs_inode *ni);
+int ni_read_frame(struct ntfs_inode *ni, u64 frame_vbo, struct page **pages,
+		  u32 pages_per_frame);
+int ni_write_frame(struct ntfs_inode *ni, struct page **pages,
+		   u32 pages_per_frame);
+
+/* globals from fslog.c */
+int log_replay(struct ntfs_inode *ni);
+
+/* globals from fsntfs.c */
+bool ntfs_fix_pre_write(struct NTFS_RECORD_HEADER *rhdr, size_t bytes);
+int ntfs_fix_post_read(struct NTFS_RECORD_HEADER *rhdr, size_t bytes,
+		       bool simple);
+int ntfs_extend_init(struct ntfs_sb_info *sbi);
+int ntfs_loadlog_and_replay(struct ntfs_inode *ni, struct ntfs_sb_info *sbi);
+const struct ATTR_DEF_ENTRY *ntfs_query_def(struct ntfs_sb_info *sbi,
+					    enum ATTR_TYPE Type);
+int ntfs_look_for_free_space(struct ntfs_sb_info *sbi, CLST lcn, CLST len,
+			     CLST *new_lcn, CLST *new_len,
+			     enum ALLOCATE_OPT opt);
+int ntfs_look_free_mft(struct ntfs_sb_info *sbi, CLST *rno, bool mft,
+		       struct ntfs_inode *ni, struct mft_inode **mi);
+void ntfs_mark_rec_free(struct ntfs_sb_info *sbi, CLST nRecord);
+int ntfs_clear_mft_tail(struct ntfs_sb_info *sbi, size_t from, size_t to);
+int ntfs_refresh_zone(struct ntfs_sb_info *sbi);
+int ntfs_update_mftmirr(struct ntfs_sb_info *sbi, int wait);
+enum NTFS_DIRTY_FLAGS {
+	NTFS_DIRTY_CLEAR = 0,
+	NTFS_DIRTY_DIRTY = 1,
+	NTFS_DIRTY_ERROR = 2,
+};
+int ntfs_set_state(struct ntfs_sb_info *sbi, enum NTFS_DIRTY_FLAGS dirty);
+int ntfs_sb_read(struct super_block *sb, u64 lbo, size_t bytes, void *buffer);
+int ntfs_sb_write(struct super_block *sb, u64 lbo, size_t bytes,
+		  const void *buffer, int wait);
+int ntfs_sb_write_run(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+		      u64 vbo, const void *buf, size_t bytes);
+struct buffer_head *ntfs_bread_run(struct ntfs_sb_info *sbi,
+				   const struct runs_tree *run, u64 vbo);
+int ntfs_read_run_nb(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+		     u64 vbo, void *buf, u32 bytes, struct ntfs_buffers *nb);
+int ntfs_read_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+		 struct NTFS_RECORD_HEADER *rhdr, u32 bytes,
+		 struct ntfs_buffers *nb);
+int ntfs_get_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+		u32 bytes, struct ntfs_buffers *nb);
+int ntfs_write_bh(struct ntfs_sb_info *sbi, struct NTFS_RECORD_HEADER *rhdr,
+		  struct ntfs_buffers *nb, int sync);
+int ntfs_bio_pages(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+		   struct page **pages, u32 nr_pages, u64 vbo, u32 bytes,
+		   u32 op);
+int ntfs_bio_fill_1(struct ntfs_sb_info *sbi, const struct runs_tree *run);
+int ntfs_vbo_to_lbo(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+		    u64 vbo, u64 *lbo, u64 *bytes);
+struct ntfs_inode *ntfs_new_inode(struct ntfs_sb_info *sbi, CLST nRec,
+				  bool dir);
+extern const u8 s_default_security[0x50];
+bool is_sd_valid(const struct SECURITY_DESCRIPTOR_RELATIVE *sd, u32 len);
+int ntfs_security_init(struct ntfs_sb_info *sbi);
+int ntfs_get_security_by_id(struct ntfs_sb_info *sbi, __le32 security_id,
+			    struct SECURITY_DESCRIPTOR_RELATIVE **sd,
+			    size_t *size);
+int ntfs_insert_security(struct ntfs_sb_info *sbi,
+			 const struct SECURITY_DESCRIPTOR_RELATIVE *sd,
+			 u32 size, __le32 *security_id, bool *inserted);
+int ntfs_reparse_init(struct ntfs_sb_info *sbi);
+int ntfs_objid_init(struct ntfs_sb_info *sbi);
+int ntfs_objid_remove(struct ntfs_sb_info *sbi, struct GUID *guid);
+int ntfs_insert_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+			const struct MFT_REF *ref);
+int ntfs_remove_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+			const struct MFT_REF *ref);
+void mark_as_free_ex(struct ntfs_sb_info *sbi, CLST lcn, CLST len, bool trim);
+int run_deallocate(struct ntfs_sb_info *sbi, struct runs_tree *run, bool trim);
+
+/* globals from index.c */
+int indx_used_bit(struct ntfs_index *indx, struct ntfs_inode *ni, size_t *bit);
+void fnd_clear(struct ntfs_fnd *fnd);
+static inline struct ntfs_fnd *fnd_get(void)
+{
+	return ntfs_zalloc(sizeof(struct ntfs_fnd));
+}
+static inline void fnd_put(struct ntfs_fnd *fnd)
+{
+	if (fnd) {
+		fnd_clear(fnd);
+		ntfs_free(fnd);
+	}
+}
+void indx_clear(struct ntfs_index *idx);
+int indx_init(struct ntfs_index *indx, struct ntfs_sb_info *sbi,
+	      const struct ATTRIB *attr, enum index_mutex_classed type);
+struct INDEX_ROOT *indx_get_root(struct ntfs_index *indx, struct ntfs_inode *ni,
+				 struct ATTRIB **attr, struct mft_inode **mi);
+int indx_read(struct ntfs_index *idx, struct ntfs_inode *ni, CLST vbn,
+	      struct indx_node **node);
+int indx_find(struct ntfs_index *indx, struct ntfs_inode *dir,
+	      const struct INDEX_ROOT *root, const void *Key, size_t KeyLen,
+	      const void *param, int *diff, struct NTFS_DE **entry,
+	      struct ntfs_fnd *fnd);
+int indx_find_sort(struct ntfs_index *indx, struct ntfs_inode *ni,
+		   const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+		   struct ntfs_fnd *fnd);
+int indx_find_raw(struct ntfs_index *indx, struct ntfs_inode *ni,
+		  const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+		  size_t *off, struct ntfs_fnd *fnd);
+int indx_insert_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+		      const struct NTFS_DE *new_de, const void *param,
+		      struct ntfs_fnd *fnd);
+int indx_delete_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+		      const void *key, u32 key_len, const void *param);
+int indx_update_dup(struct ntfs_inode *ni, struct ntfs_sb_info *sbi,
+		    const struct ATTR_FILE_NAME *fname,
+		    const struct NTFS_DUP_INFO *dup, int sync);
+
+/* globals from inode.c */
+struct inode *ntfs_iget5(struct super_block *sb, const struct MFT_REF *ref,
+			 const struct cpu_str *name);
+int ntfs_set_size(struct inode *inode, u64 new_size);
+int reset_log_file(struct inode *inode);
+int ntfs_get_block(struct inode *inode, sector_t vbn,
+		   struct buffer_head *bh_result, int create);
+int ntfs3_write_inode(struct inode *inode, struct writeback_control *wbc);
+int ntfs_sync_inode(struct inode *inode);
+int ntfs_flush_inodes(struct super_block *sb, struct inode *i1,
+		      struct inode *i2);
+int inode_write_data(struct inode *inode, const void *data, size_t bytes);
+int ntfs_create_inode(struct inode *dir, struct dentry *dentry,
+		      const struct cpu_str *uni, umode_t mode, dev_t dev,
+		      const char *symname, u32 size, int excl,
+		      struct ntfs_fnd *fnd, struct inode **new_inode);
+int ntfs_link_inode(struct inode *inode, struct dentry *dentry);
+int ntfs_unlink_inode(struct inode *dir, const struct dentry *dentry);
+void ntfs_evict_inode(struct inode *inode);
+extern const struct inode_operations ntfs_link_inode_operations;
+extern const struct address_space_operations ntfs_aops;
+extern const struct address_space_operations ntfs_aops_cmpr;
+
+/* globals from name_i.c*/
+int fill_name_de(struct ntfs_sb_info *sbi, void *buf, const struct qstr *name,
+		 const struct cpu_str *uni);
+struct dentry *ntfs3_get_parent(struct dentry *child);
+
+extern const struct inode_operations ntfs_dir_inode_operations;
+
+/* globals from record.c */
+int mi_get(struct ntfs_sb_info *sbi, CLST rno, struct mft_inode **mi);
+void mi_put(struct mft_inode *mi);
+int mi_init(struct mft_inode *mi, struct ntfs_sb_info *sbi, CLST rno);
+int mi_read(struct mft_inode *mi, bool is_mft);
+struct ATTRIB *mi_enum_attr(struct mft_inode *mi, struct ATTRIB *attr);
+// TODO: id?
+struct ATTRIB *mi_find_attr(struct mft_inode *mi, struct ATTRIB *attr,
+			    enum ATTR_TYPE type, const __le16 *name,
+			    size_t name_len, const __le16 *id);
+static inline struct ATTRIB *rec_find_attr_le(struct mft_inode *rec,
+					      struct ATTR_LIST_ENTRY *le)
+{
+	return mi_find_attr(rec, NULL, le->type, le_name(le), le->name_len,
+			    &le->id);
+}
+int mi_write(struct mft_inode *mi, int wait);
+int mi_format_new(struct mft_inode *mi, struct ntfs_sb_info *sbi, CLST rno,
+		  __le16 flags, bool is_mft);
+void mi_mark_free(struct mft_inode *mi);
+struct ATTRIB *mi_insert_attr(struct mft_inode *mi, enum ATTR_TYPE type,
+			      const __le16 *name, u8 name_len, u32 asize,
+			      u16 name_off);
+
+bool mi_remove_attr(struct mft_inode *mi, struct ATTRIB *attr);
+bool mi_resize_attr(struct mft_inode *mi, struct ATTRIB *attr, int bytes);
+int mi_pack_runs(struct mft_inode *mi, struct ATTRIB *attr,
+		 struct runs_tree *run, CLST len);
+static inline bool mi_is_ref(const struct mft_inode *mi,
+			     const struct MFT_REF *ref)
+{
+	if (le32_to_cpu(ref->low) != mi->rno)
+		return false;
+	if (ref->seq != mi->mrec->seq)
+		return false;
+
+#ifdef NTFS3_64BIT_CLUSTER
+	return le16_to_cpu(ref->high) == (mi->rno >> 32);
+#else
+	return !ref->high;
+#endif
+}
+
+/* globals from run.c */
+bool run_lookup_entry(const struct runs_tree *run, CLST vcn, CLST *lcn,
+		      CLST *len, size_t *index);
+void run_truncate(struct runs_tree *run, CLST vcn);
+void run_truncate_head(struct runs_tree *run, CLST vcn);
+void run_truncate_around(struct runs_tree *run, CLST vcn);
+bool run_lookup(const struct runs_tree *run, CLST vcn, size_t *Index);
+bool run_add_entry(struct runs_tree *run, CLST vcn, CLST lcn, CLST len,
+		   bool is_mft);
+bool run_collapse_range(struct runs_tree *run, CLST vcn, CLST len);
+bool run_get_entry(const struct runs_tree *run, size_t index, CLST *vcn,
+		   CLST *lcn, CLST *len);
+bool run_is_mapped_full(const struct runs_tree *run, CLST svcn, CLST evcn);
+
+int run_pack(const struct runs_tree *run, CLST svcn, CLST len, u8 *run_buf,
+	     u32 run_buf_size, CLST *packed_vcns);
+int run_unpack(struct runs_tree *run, struct ntfs_sb_info *sbi, CLST ino,
+	       CLST svcn, CLST evcn, CLST vcn, const u8 *run_buf,
+	       u32 run_buf_size);
+
+#ifdef NTFS3_CHECK_FREE_CLST
+int run_unpack_ex(struct runs_tree *run, struct ntfs_sb_info *sbi, CLST ino,
+		  CLST svcn, CLST evcn, CLST vcn, const u8 *run_buf,
+		  u32 run_buf_size);
+#else
+#define run_unpack_ex run_unpack
+#endif
+int run_get_highest_vcn(CLST vcn, const u8 *run_buf, u64 *highest_vcn);
+
+/* globals from super.c */
+void *ntfs_set_shared(void *ptr, u32 bytes);
+void *ntfs_put_shared(void *ptr);
+void ntfs_unmap_meta(struct super_block *sb, CLST lcn, CLST len);
+int ntfs_discard(struct ntfs_sb_info *sbi, CLST Lcn, CLST Len);
+
+/* globals from ubitmap.c*/
+void wnd_close(struct wnd_bitmap *wnd);
+static inline size_t wnd_zeroes(const struct wnd_bitmap *wnd)
+{
+	return wnd->total_zeroes;
+}
+int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits);
+int wnd_set_free(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+int wnd_set_used(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+bool wnd_is_free(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+bool wnd_is_used(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+
+/* Possible values for 'flags' 'wnd_find' */
+#define BITMAP_FIND_MARK_AS_USED 0x01
+#define BITMAP_FIND_FULL 0x02
+size_t wnd_find(struct wnd_bitmap *wnd, size_t to_alloc, size_t hint,
+		size_t flags, size_t *allocated);
+int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits);
+void wnd_zone_set(struct wnd_bitmap *wnd, size_t Lcn, size_t Len);
+int ntfs_trim_fs(struct ntfs_sb_info *sbi, struct fstrim_range *range);
+
+/* globals from upcase.c */
+int ntfs_cmp_names(const __le16 *s1, size_t l1, const __le16 *s2, size_t l2,
+		   const u16 *upcase, bool bothcase);
+int ntfs_cmp_names_cpu(const struct cpu_str *uni1, const struct le_str *uni2,
+		       const u16 *upcase, bool bothcase);
+
+/* globals from xattr.c */
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+struct posix_acl *ntfs_get_acl(struct inode *inode, int type);
+int ntfs_set_acl(struct inode *inode, struct posix_acl *acl, int type);
+int ntfs_init_acl(struct inode *inode, struct inode *dir);
+#else
+#define ntfs_get_acl NULL
+#define ntfs_set_acl NULL
+#endif
+
+int ntfs_acl_chmod(struct inode *inode);
+int ntfs_permission(struct inode *inode, int mask);
+ssize_t ntfs_listxattr(struct dentry *dentry, char *buffer, size_t size);
+extern const struct xattr_handler *ntfs_xattr_handlers[];
+
+/* globals from lznt.c */
+struct lznt *get_lznt_ctx(int level);
+size_t compress_lznt(const void *uncompressed, size_t uncompressed_size,
+		     void *compressed, size_t compressed_size,
+		     struct lznt *ctx);
+ssize_t decompress_lznt(const void *compressed, size_t compressed_size,
+			void *uncompressed, size_t uncompressed_size);
+
+static inline bool is_ntfs3(struct ntfs_sb_info *sbi)
+{
+	return sbi->volume.major_ver >= 3;
+}
+
+/*(sb->s_flags & SB_ACTIVE)*/
+static inline bool is_mounted(struct ntfs_sb_info *sbi)
+{
+	return !!sbi->sb->s_root;
+}
+
+static inline bool ntfs_is_meta_file(struct ntfs_sb_info *sbi, CLST rno)
+{
+	return rno < MFT_REC_FREE || rno == sbi->objid_no ||
+	       rno == sbi->quota_no || rno == sbi->reparse_no ||
+	       rno == sbi->usn_jrnl_no;
+}
+
+static inline void ntfs_unmap_page(struct page *page)
+{
+	kunmap(page);
+	put_page(page);
+}
+
+static inline struct page *ntfs_map_page(struct address_space *mapping,
+					 unsigned long index)
+{
+	struct page *page = read_mapping_page(mapping, index, NULL);
+
+	if (!IS_ERR(page)) {
+		kmap(page);
+		if (!PageError(page))
+			return page;
+		ntfs_unmap_page(page);
+		return ERR_PTR(-EIO);
+	}
+	return page;
+}
+
+static inline size_t wnd_zone_bit(const struct wnd_bitmap *wnd)
+{
+	return wnd->zone_bit;
+}
+
+static inline size_t wnd_zone_len(const struct wnd_bitmap *wnd)
+{
+	return wnd->zone_end - wnd->zone_bit;
+}
+
+static inline void run_init(struct runs_tree *run)
+{
+	run->runs_ = NULL;
+	run->count = 0;
+	run->allocated = 0;
+}
+
+static inline struct runs_tree *run_alloc(void)
+{
+	return ntfs_zalloc(sizeof(struct runs_tree));
+}
+
+static inline void run_close(struct runs_tree *run)
+{
+	ntfs_free(run->runs_);
+	memset(run, 0, sizeof(*run));
+}
+
+static inline void run_free(struct runs_tree *run)
+{
+	if (run) {
+		ntfs_free(run->runs_);
+		ntfs_free(run);
+	}
+}
+
+static inline bool run_is_empty(struct runs_tree *run)
+{
+	return !run->count;
+}
+
+/* NTFS uses quad aligned bitmaps */
+static inline size_t bitmap_size(size_t bits)
+{
+	return QuadAlign((bits + 7) >> 3);
+}
+
+#define _100ns2seconds 10000000
+#define SecondsToStartOf1970 0x00000002B6109100
+
+#define NTFS_TIME_GRAN 100
+
+/*
+ * kernel2nt
+ *
+ * converts in-memory kernel timestamp into nt time
+ */
+static inline __le64 kernel2nt(const struct timespec64 *ts)
+{
+	// 10^7 units of 100 nanoseconds one second
+	return cpu_to_le64(_100ns2seconds *
+				   (ts->tv_sec + SecondsToStartOf1970) +
+			   ts->tv_nsec / NTFS_TIME_GRAN);
+}
+
+/*
+ * nt2kernel
+ *
+ * converts on-disk nt time into kernel timestamp
+ */
+static inline void nt2kernel(const __le64 tm, struct timespec64 *ts)
+{
+	u64 t = le64_to_cpu(tm) - _100ns2seconds * SecondsToStartOf1970;
+
+	// WARNING: do_div changes its first argument(!)
+	ts->tv_nsec = do_div(t, _100ns2seconds) * 100;
+	ts->tv_sec = t;
+}
+
+static inline struct ntfs_sb_info *ntfs_sb(struct super_block *sb)
+{
+	return sb->s_fs_info;
+}
+
+/* Align up on cluster boundary */
+static inline u64 ntfs_up_cluster(const struct ntfs_sb_info *sbi, u64 size)
+{
+	return (size + sbi->cluster_mask) & sbi->cluster_mask_inv;
+}
+
+/* Align up on cluster boundary */
+static inline u64 ntfs_up_block(const struct super_block *sb, u64 size)
+{
+	return (size + sb->s_blocksize - 1) & ~(u64)(sb->s_blocksize - 1);
+}
+
+static inline CLST bytes_to_cluster(const struct ntfs_sb_info *sbi, u64 size)
+{
+	return (size + sbi->cluster_mask) >> sbi->cluster_bits;
+}
+
+static inline u64 bytes_to_block(const struct super_block *sb, u64 size)
+{
+	return (size + sb->s_blocksize - 1) >> sb->s_blocksize_bits;
+}
+
+static inline struct buffer_head *ntfs_bread(struct super_block *sb,
+					     sector_t block)
+{
+	struct buffer_head *bh;
+
+	bh = sb_bread(sb, block);
+	if (bh)
+		return bh;
+
+	ntfs_err(sb, "failed to read volume at offset 0x%llx",
+		 (u64)block << sb->s_blocksize_bits);
+	return NULL;
+}
+
+static inline bool is_power_of2(size_t v)
+{
+	return v && !(v & (v - 1));
+}
+
+static inline struct ntfs_inode *ntfs_i(struct inode *inode)
+{
+	return container_of(inode, struct ntfs_inode, vfs_inode);
+}
+
+static inline bool is_compressed(const struct ntfs_inode *ni)
+{
+	return (ni->std_fa & FILE_ATTRIBUTE_COMPRESSED) ||
+	       (ni->ni_flags & NI_FLAG_COMPRESSED_MASK);
+}
+
+static inline int ni_ext_compress_bits(const struct ntfs_inode *ni)
+{
+	return 0xb + (ni->ni_flags & NI_FLAG_COMPRESSED_MASK);
+}
+
+/* bits - 0xc, 0xd, 0xe, 0xf, 0x10 */
+static inline void ni_set_ext_compress_bits(struct ntfs_inode *ni, u8 bits)
+{
+	ni->ni_flags |= (bits - 0xb) & NI_FLAG_COMPRESSED_MASK;
+}
+
+static inline bool is_dedup(const struct ntfs_inode *ni)
+{
+	return ni->ni_flags & NI_FLAG_DEDUPLICATED;
+}
+
+static inline bool is_encrypted(const struct ntfs_inode *ni)
+{
+	return ni->std_fa & FILE_ATTRIBUTE_ENCRYPTED;
+}
+
+static inline bool is_sparsed(const struct ntfs_inode *ni)
+{
+	return ni->std_fa & FILE_ATTRIBUTE_SPARSE_FILE;
+}
+
+static inline int is_resident(struct ntfs_inode *ni)
+{
+	return ni->ni_flags & NI_FLAG_RESIDENT;
+}
+
+static inline void le16_sub_cpu(__le16 *var, u16 val)
+{
+	*var = cpu_to_le16(le16_to_cpu(*var) - val);
+}
+
+static inline void le32_sub_cpu(__le32 *var, u32 val)
+{
+	*var = cpu_to_le32(le32_to_cpu(*var) - val);
+}
+
+static inline void nb_put(struct ntfs_buffers *nb)
+{
+	u32 i, nbufs = nb->nbufs;
+
+	if (!nbufs)
+		return;
+
+	for (i = 0; i < nbufs; i++)
+		put_bh(nb->bh[i]);
+	nb->nbufs = 0;
+}
+
+static inline void put_indx_node(struct indx_node *in)
+{
+	if (!in)
+		return;
+
+	ntfs_free(in->index);
+	nb_put(&in->nb);
+	ntfs_free(in);
+}
+
+static inline void mi_clear(struct mft_inode *mi)
+{
+	nb_put(&mi->nb);
+	ntfs_free(mi->mrec);
+	mi->mrec = NULL;
+}
+
+static inline void ni_lock(struct ntfs_inode *ni)
+{
+	mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_NORMAL);
+}
+
+static inline void ni_lock_dir(struct ntfs_inode *ni)
+{
+	mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_PARENT);
+}
+
+static inline void ni_unlock(struct ntfs_inode *ni)
+{
+	mutex_unlock(&ni->ni_lock);
+}
+
+static inline int ni_trylock(struct ntfs_inode *ni)
+{
+	return mutex_trylock(&ni->ni_lock);
+}
+
+static inline int attr_load_runs_attr(struct ntfs_inode *ni,
+				      struct ATTRIB *attr,
+				      struct runs_tree *run, CLST vcn)
+{
+	return attr_load_runs_vcn(ni, attr->type, attr_name(attr),
+				  attr->name_len, run, vcn);
+}
+
+static inline void le64_sub_cpu(__le64 *var, u64 val)
+{
+	*var = cpu_to_le64(le64_to_cpu(*var) - val);
+}
diff --git a/fs/ntfs3/upcase.c b/fs/ntfs3/upcase.c
new file mode 100644
index 000000000000..57a03731b75a
--- /dev/null
+++ b/fs/ntfs3/upcase.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/module.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+static inline u16 upcase_unicode_char(const u16 *upcase, u16 chr)
+{
+	if (chr < 'a')
+		return chr;
+
+	if (chr <= 'z')
+		return chr - ('a' - 'A');
+
+	return upcase[chr];
+}
+
+/* Thanks Kari Argillander <kari.argillander@gmail.com> for idea and implementation 'bothcase' */
+int ntfs_cmp_names(const __le16 *s1, size_t l1, const __le16 *s2, size_t l2,
+		   const u16 *upcase, bool bothcase)
+{
+	int diff1 = 0;
+	int diff2;
+	size_t len = min(l1, l2);
+
+	if (!bothcase && upcase)
+		goto case_insentive;
+
+	for (; len; s1++, s2++, len--) {
+		diff1 = le16_to_cpu(*s1) - le16_to_cpu(*s2);
+		if (diff1) {
+			if (bothcase && upcase)
+				goto case_insentive;
+
+			return diff1;
+		}
+	}
+	return l1 - l2;
+
+case_insentive:
+	for (; len; s1++, s2++, len--) {
+		diff2 = upcase_unicode_char(upcase, le16_to_cpu(*s1)) -
+			upcase_unicode_char(upcase, le16_to_cpu(*s2));
+		if (diff2)
+			return diff2;
+	}
+
+	if (bothcase && diff1)
+		return diff1;
+
+	return l1 - l2;
+}
+
+int ntfs_cmp_names_cpu(const struct cpu_str *uni1, const struct le_str *uni2,
+		       const u16 *upcase, bool bothcase)
+{
+	const u16 *s1 = uni1->name;
+	const __le16 *s2 = uni2->name;
+	size_t l1 = uni1->len;
+	size_t l2 = uni2->len;
+	size_t len = min(l1, l2);
+	int diff1 = 0;
+	int diff2;
+
+	if (!bothcase && upcase)
+		goto case_insentive;
+
+	for (; len; s1++, s2++, len--) {
+		diff1 = *s1 - le16_to_cpu(*s2);
+		if (diff1) {
+			if (bothcase && upcase)
+				goto case_insentive;
+
+			return diff1;
+		}
+	}
+	return l1 - l2;
+
+case_insentive:
+	for (; len; s1++, s2++, len--) {
+		diff2 = upcase_unicode_char(upcase, *s1) -
+			upcase_unicode_char(upcase, le16_to_cpu(*s2));
+		if (diff2)
+			return diff2;
+	}
+
+	if (bothcase && diff1)
+		return diff1;
+
+	return l1 - l2;
+}

From patchwork Thu Jan 28 09:04:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053215
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id B1350C43381
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:44 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 410B264DD1
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:44 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S232163AbhA1JI2 (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:08:28 -0500
Received: from relayfre-01.paragon-software.com ([176.12.100.13]:47828 "EHLO
        relayfre-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231749AbhA1JGT (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:06:19 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relayfre-01.paragon-software.com (Postfix) with ESMTPS id
 373CF1E80;
        Thu, 28 Jan 2021 12:05:03 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824703;
        bh=v1nLxGoVBtbft/OKRDTLqnXhbGDZMOT3okjkJDJLLB4=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=i+UUN6Wj8S+3TmO1vqBpkSuWBReSZP/HbRMI4T2sYJs5JtMeT5SdS6LNvWstIeSYu
         r+ZfCXtBBwK7p/kMTU7GW1M8GgljjaIMqbttHbyynTVzKZaV/4G2qxLMKU639cpJGe
         6pryG0fSAqCOO0MU0eW5RF+2YaU8AgTP0fJswlYw=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:02 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 02/10] fs/ntfs3: Add initialization of super block
Date: Thu, 28 Jan 2021 12:04:47 +0300
Message-ID: 
 <20210128090455.3576502-3-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds initialization of super block

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 fs/ntfs3/fsntfs.c | 2535 ++++++++++++++++++++++++++++++++++++++++++
 fs/ntfs3/index.c  | 2662 +++++++++++++++++++++++++++++++++++++++++++++
 fs/ntfs3/inode.c  | 2050 ++++++++++++++++++++++++++++++++++
 fs/ntfs3/super.c  | 1479 +++++++++++++++++++++++++
 4 files changed, 8726 insertions(+)
 create mode 100644 fs/ntfs3/fsntfs.c
 create mode 100644 fs/ntfs3/index.c
 create mode 100644 fs/ntfs3/inode.c
 create mode 100644 fs/ntfs3/super.c

diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
new file mode 100644
index 000000000000..eae477bd5c4a
--- /dev/null
+++ b/fs/ntfs3/fsntfs.c
@@ -0,0 +1,2535 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+// clang-format off
+const struct cpu_str NAME_MFT = {
+	4, 0, { '$', 'M', 'F', 'T' },
+};
+const struct cpu_str NAME_MIRROR = {
+	8, 0, { '$', 'M', 'F', 'T', 'M', 'i', 'r', 'r' },
+};
+const struct cpu_str NAME_LOGFILE = {
+	8, 0, { '$', 'L', 'o', 'g', 'F', 'i', 'l', 'e' },
+};
+const struct cpu_str NAME_VOLUME = {
+	7, 0, { '$', 'V', 'o', 'l', 'u', 'm', 'e' },
+};
+const struct cpu_str NAME_ATTRDEF = {
+	8, 0, { '$', 'A', 't', 't', 'r', 'D', 'e', 'f' },
+};
+const struct cpu_str NAME_ROOT = {
+	1, 0, { '.' },
+};
+const struct cpu_str NAME_BITMAP = {
+	7, 0, { '$', 'B', 'i', 't', 'm', 'a', 'p' },
+};
+const struct cpu_str NAME_BOOT = {
+	5, 0, { '$', 'B', 'o', 'o', 't' },
+};
+const struct cpu_str NAME_BADCLUS = {
+	8, 0, { '$', 'B', 'a', 'd', 'C', 'l', 'u', 's' },
+};
+const struct cpu_str NAME_QUOTA = {
+	6, 0, { '$', 'Q', 'u', 'o', 't', 'a' },
+};
+const struct cpu_str NAME_SECURE = {
+	7, 0, { '$', 'S', 'e', 'c', 'u', 'r', 'e' },
+};
+const struct cpu_str NAME_UPCASE = {
+	7, 0, { '$', 'U', 'p', 'C', 'a', 's', 'e' },
+};
+const struct cpu_str NAME_EXTEND = {
+	7, 0, { '$', 'E', 'x', 't', 'e', 'n', 'd' },
+};
+const struct cpu_str NAME_OBJID = {
+	6, 0, { '$', 'O', 'b', 'j', 'I', 'd' },
+};
+const struct cpu_str NAME_REPARSE = {
+	8, 0, { '$', 'R', 'e', 'p', 'a', 'r', 's', 'e' },
+};
+const struct cpu_str NAME_USNJRNL = {
+	8, 0, { '$', 'U', 's', 'n', 'J', 'r', 'n', 'l' },
+};
+const __le16 BAD_NAME[4] = {
+	cpu_to_le16('$'), cpu_to_le16('B'), cpu_to_le16('a'), cpu_to_le16('d'),
+};
+const __le16 I30_NAME[4] = {
+	cpu_to_le16('$'), cpu_to_le16('I'), cpu_to_le16('3'), cpu_to_le16('0'),
+};
+const __le16 SII_NAME[4] = {
+	cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('I'), cpu_to_le16('I'),
+};
+const __le16 SDH_NAME[4] = {
+	cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('D'), cpu_to_le16('H'),
+};
+const __le16 SDS_NAME[4] = {
+	cpu_to_le16('$'), cpu_to_le16('S'), cpu_to_le16('D'), cpu_to_le16('S'),
+};
+const __le16 SO_NAME[2] = {
+	cpu_to_le16('$'), cpu_to_le16('O'),
+};
+const __le16 SQ_NAME[2] = {
+	cpu_to_le16('$'), cpu_to_le16('Q'),
+};
+const __le16 SR_NAME[2] = {
+	cpu_to_le16('$'), cpu_to_le16('R'),
+};
+
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+const __le16 WOF_NAME[17] = {
+	cpu_to_le16('W'), cpu_to_le16('o'), cpu_to_le16('f'), cpu_to_le16('C'),
+	cpu_to_le16('o'), cpu_to_le16('m'), cpu_to_le16('p'), cpu_to_le16('r'),
+	cpu_to_le16('e'), cpu_to_le16('s'), cpu_to_le16('s'), cpu_to_le16('e'),
+	cpu_to_le16('d'), cpu_to_le16('D'), cpu_to_le16('a'), cpu_to_le16('t'),
+	cpu_to_le16('a'),
+};
+#endif
+
+// clang-format on
+
+/*
+ * ntfs_fix_pre_write
+ *
+ * inserts fixups into 'rhdr' before writing to disk
+ */
+bool ntfs_fix_pre_write(struct NTFS_RECORD_HEADER *rhdr, size_t bytes)
+{
+	u16 *fixup, *ptr;
+	u16 sample;
+	u16 fo = le16_to_cpu(rhdr->fix_off);
+	u16 fn = le16_to_cpu(rhdr->fix_num);
+
+	if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+	    fn * SECTOR_SIZE > bytes) {
+		return false;
+	}
+
+	/* Get fixup pointer */
+	fixup = Add2Ptr(rhdr, fo);
+
+	if (*fixup >= 0x7FFF)
+		*fixup = 1;
+	else
+		*fixup += 1;
+
+	sample = *fixup;
+
+	ptr = Add2Ptr(rhdr, SECTOR_SIZE - sizeof(short));
+
+	while (fn--) {
+		*++fixup = *ptr;
+		*ptr = sample;
+		ptr += SECTOR_SIZE / sizeof(short);
+	}
+	return true;
+}
+
+/*
+ * ntfs_fix_post_read
+ *
+ * remove fixups after reading from disk
+ * Returns < 0 if error, 0 if ok, 1 if need to update fixups
+ */
+int ntfs_fix_post_read(struct NTFS_RECORD_HEADER *rhdr, size_t bytes,
+		       bool simple)
+{
+	int ret;
+	u16 *fixup, *ptr;
+	u16 sample, fo, fn;
+
+	fo = le16_to_cpu(rhdr->fix_off);
+	fn = simple ? ((bytes >> SECTOR_SHIFT) + 1) :
+		      le16_to_cpu(rhdr->fix_num);
+
+	/* Check errors */
+	if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+	    fn * SECTOR_SIZE > bytes) {
+		return -EINVAL; /* native chkntfs returns ok! */
+	}
+
+	/* Get fixup pointer */
+	fixup = Add2Ptr(rhdr, fo);
+	sample = *fixup;
+	ptr = Add2Ptr(rhdr, SECTOR_SIZE - sizeof(short));
+	ret = 0;
+
+	while (fn--) {
+		/* Test current word */
+		if (*ptr != sample) {
+			/* Fixup does not match! Is it serious error? */
+			ret = -E_NTFS_FIXUP;
+		}
+
+		/* Replace fixup */
+		*ptr = *++fixup;
+		ptr += SECTOR_SIZE / sizeof(short);
+	}
+
+	return ret;
+}
+
+/*
+ * ntfs_extend_init
+ *
+ * loads $Extend file
+ */
+int ntfs_extend_init(struct ntfs_sb_info *sbi)
+{
+	int err;
+	struct super_block *sb = sbi->sb;
+	struct inode *inode, *inode2;
+	struct MFT_REF ref;
+
+	if (sbi->volume.major_ver < 3) {
+		ntfs_notice(sb, "Skip $Extend 'cause NTFS version");
+		return 0;
+	}
+
+	ref.low = cpu_to_le32(MFT_REC_EXTEND);
+	ref.high = 0;
+	ref.seq = cpu_to_le16(MFT_REC_EXTEND);
+	inode = ntfs_iget5(sb, &ref, &NAME_EXTEND);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $Extend.");
+		inode = NULL;
+		goto out;
+	}
+
+	/* if ntfs_iget5 reads from disk it never returns bad inode */
+	if (!S_ISDIR(inode->i_mode)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Try to find $ObjId */
+	inode2 = dir_search_u(inode, &NAME_OBJID, NULL);
+	if (inode2 && !IS_ERR(inode2)) {
+		if (is_bad_inode(inode2)) {
+			iput(inode2);
+		} else {
+			sbi->objid.ni = ntfs_i(inode2);
+			sbi->objid_no = inode2->i_ino;
+		}
+	}
+
+	/* Try to find $Quota */
+	inode2 = dir_search_u(inode, &NAME_QUOTA, NULL);
+	if (inode2 && !IS_ERR(inode2)) {
+		sbi->quota_no = inode2->i_ino;
+		iput(inode2);
+	}
+
+	/* Try to find $Reparse */
+	inode2 = dir_search_u(inode, &NAME_REPARSE, NULL);
+	if (inode2 && !IS_ERR(inode2)) {
+		sbi->reparse.ni = ntfs_i(inode2);
+		sbi->reparse_no = inode2->i_ino;
+	}
+
+	/* Try to find $UsnJrnl */
+	inode2 = dir_search_u(inode, &NAME_USNJRNL, NULL);
+	if (inode2 && !IS_ERR(inode2)) {
+		sbi->usn_jrnl_no = inode2->i_ino;
+		iput(inode2);
+	}
+
+	err = 0;
+out:
+	iput(inode);
+	return err;
+}
+
+int ntfs_loadlog_and_replay(struct ntfs_inode *ni, struct ntfs_sb_info *sbi)
+{
+	int err = 0;
+	struct super_block *sb = sbi->sb;
+	struct inode *inode = &ni->vfs_inode;
+	struct MFT_REF ref;
+
+	/* Check for 4GB */
+	if (inode->i_size >= 0x100000000ull) {
+		ntfs_err(sb, "$LogFile is too big");
+		err = -EINVAL;
+		goto out;
+	}
+
+	sbi->flags |= NTFS_FLAGS_LOG_REPLAYING;
+
+	ref.low = cpu_to_le32(MFT_REC_MFT);
+	ref.high = 0;
+	ref.seq = cpu_to_le16(1);
+
+	inode = ntfs_iget5(sb, &ref, NULL);
+
+	if (IS_ERR(inode))
+		inode = NULL;
+
+	if (!inode) {
+		/* Try to use mft copy */
+		u64 t64 = sbi->mft.lbo;
+
+		sbi->mft.lbo = sbi->mft.lbo2;
+		inode = ntfs_iget5(sb, &ref, NULL);
+		sbi->mft.lbo = t64;
+		if (IS_ERR(inode))
+			inode = NULL;
+	}
+
+	if (!inode) {
+		err = -EINVAL;
+		ntfs_err(sb, "Failed to load $MFT.");
+		goto out;
+	}
+
+	sbi->mft.ni = ntfs_i(inode);
+
+	err = ni_load_all_mi(sbi->mft.ni);
+	if (!err)
+		err = log_replay(ni);
+
+	iput(inode);
+	sbi->mft.ni = NULL;
+
+	sync_blockdev(sb->s_bdev);
+	invalidate_bdev(sb->s_bdev);
+
+	if (sbi->flags & NTFS_FLAGS_NEED_REPLAY) {
+		err = 0;
+		goto out;
+	}
+
+	if (sb_rdonly(sb))
+		goto out;
+
+	err = ntfs_bio_fill_1(sbi, &ni->file.run);
+
+out:
+	sbi->flags &= ~NTFS_FLAGS_LOG_REPLAYING;
+
+	return err;
+}
+
+/*
+ * ntfs_query_def
+ *
+ * returns current ATTR_DEF_ENTRY for given attribute type
+ */
+const struct ATTR_DEF_ENTRY *ntfs_query_def(struct ntfs_sb_info *sbi,
+					    enum ATTR_TYPE type)
+{
+	int type_in = le32_to_cpu(type);
+	size_t min_idx = 0;
+	size_t max_idx = sbi->def_entries - 1;
+
+	while (min_idx <= max_idx) {
+		size_t i = min_idx + ((max_idx - min_idx) >> 1);
+		const struct ATTR_DEF_ENTRY *entry = sbi->def_table + i;
+		int diff = le32_to_cpu(entry->type) - type_in;
+
+		if (!diff)
+			return entry;
+		if (diff < 0)
+			min_idx = i + 1;
+		else if (i)
+			max_idx = i - 1;
+		else
+			return NULL;
+	}
+	return NULL;
+}
+
+/*
+ * ntfs_look_for_free_space
+ *
+ * looks for a free space in bitmap
+ */
+int ntfs_look_for_free_space(struct ntfs_sb_info *sbi, CLST lcn, CLST len,
+			     CLST *new_lcn, CLST *new_len,
+			     enum ALLOCATE_OPT opt)
+{
+	int err;
+	struct super_block *sb = sbi->sb;
+	size_t a_lcn, zlen, zeroes, zlcn, zlen2, ztrim, new_zlen;
+	struct wnd_bitmap *wnd = &sbi->used.bitmap;
+
+	down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+	if (opt & ALLOCATE_MFT) {
+		CLST alen;
+
+		zlen = wnd_zone_len(wnd);
+
+		if (!zlen) {
+			err = ntfs_refresh_zone(sbi);
+			if (err)
+				goto out;
+
+			zlen = wnd_zone_len(wnd);
+
+			if (!zlen) {
+				ntfs_err(sbi->sb,
+					 "no free space to extend mft");
+				err = -ENOSPC;
+				goto out;
+			}
+		}
+
+		lcn = wnd_zone_bit(wnd);
+		alen = zlen > len ? len : zlen;
+
+		wnd_zone_set(wnd, lcn + alen, zlen - alen);
+
+		err = wnd_set_used(wnd, lcn, alen);
+		if (err)
+			goto out;
+
+		*new_lcn = lcn;
+		*new_len = alen;
+		goto ok;
+	}
+
+	/*
+	 * 'Cause cluster 0 is always used this value means that we should use
+	 * cached value of 'next_free_lcn' to improve performance
+	 */
+	if (!lcn)
+		lcn = sbi->used.next_free_lcn;
+
+	if (lcn >= wnd->nbits)
+		lcn = 0;
+
+	*new_len = wnd_find(wnd, len, lcn, BITMAP_FIND_MARK_AS_USED, &a_lcn);
+	if (*new_len) {
+		*new_lcn = a_lcn;
+		goto ok;
+	}
+
+	/* Try to use clusters from MftZone */
+	zlen = wnd_zone_len(wnd);
+	zeroes = wnd_zeroes(wnd);
+
+	/* Check too big request */
+	if (len > zeroes + zlen)
+		goto no_space;
+
+	if (zlen <= NTFS_MIN_MFT_ZONE)
+		goto no_space;
+
+	/* How many clusters to cat from zone */
+	zlcn = wnd_zone_bit(wnd);
+	zlen2 = zlen >> 1;
+	ztrim = len > zlen ? zlen : (len > zlen2 ? len : zlen2);
+	new_zlen = zlen - ztrim;
+
+	if (new_zlen < NTFS_MIN_MFT_ZONE) {
+		new_zlen = NTFS_MIN_MFT_ZONE;
+		if (new_zlen > zlen)
+			new_zlen = zlen;
+	}
+
+	wnd_zone_set(wnd, zlcn, new_zlen);
+
+	/* allocate continues clusters */
+	*new_len =
+		wnd_find(wnd, len, 0,
+			 BITMAP_FIND_MARK_AS_USED | BITMAP_FIND_FULL, &a_lcn);
+	if (*new_len) {
+		*new_lcn = a_lcn;
+		goto ok;
+	}
+
+no_space:
+	up_write(&wnd->rw_lock);
+
+	return -ENOSPC;
+
+ok:
+	err = 0;
+
+	ntfs_unmap_meta(sb, *new_lcn, *new_len);
+
+	if (opt & ALLOCATE_MFT)
+		goto out;
+
+	/* Set hint for next requests */
+	sbi->used.next_free_lcn = *new_lcn + *new_len;
+
+out:
+	up_write(&wnd->rw_lock);
+	return err;
+}
+
+/*
+ * ntfs_extend_mft
+ *
+ * allocates additional MFT records
+ * sbi->mft.bitmap is locked for write
+ *
+ * NOTE: recursive:
+ *	ntfs_look_free_mft ->
+ *	ntfs_extend_mft ->
+ *	attr_set_size ->
+ *	ni_insert_nonresident ->
+ *	ni_insert_attr ->
+ *	ni_ins_attr_ext ->
+ *	ntfs_look_free_mft ->
+ *	ntfs_extend_mft
+ * To avoid recursive always allocate space for two new mft records
+ * see attrib.c: "at least two mft to avoid recursive loop"
+ */
+static int ntfs_extend_mft(struct ntfs_sb_info *sbi)
+{
+	int err;
+	struct ntfs_inode *ni = sbi->mft.ni;
+	size_t new_mft_total;
+	u64 new_mft_bytes, new_bitmap_bytes;
+	struct ATTRIB *attr;
+	struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+
+	new_mft_total = (wnd->nbits + MFT_INCREASE_CHUNK + 127) & (CLST)~127;
+	new_mft_bytes = (u64)new_mft_total << sbi->record_bits;
+
+	/* Step 1: Resize $MFT::DATA */
+	down_write(&ni->file.run_lock);
+	err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run,
+			    new_mft_bytes, NULL, false, &attr);
+
+	if (err) {
+		up_write(&ni->file.run_lock);
+		goto out;
+	}
+
+	attr->nres.valid_size = attr->nres.data_size;
+	new_mft_total = le64_to_cpu(attr->nres.alloc_size) >> sbi->record_bits;
+	ni->mi.dirty = true;
+
+	/* Step 2: Resize $MFT::BITMAP */
+	new_bitmap_bytes = bitmap_size(new_mft_total);
+
+	err = attr_set_size(ni, ATTR_BITMAP, NULL, 0, &sbi->mft.bitmap.run,
+			    new_bitmap_bytes, &new_bitmap_bytes, true, NULL);
+
+	/* Refresh Mft Zone if necessary */
+	down_write_nested(&sbi->used.bitmap.rw_lock, BITMAP_MUTEX_CLUSTERS);
+
+	ntfs_refresh_zone(sbi);
+
+	up_write(&sbi->used.bitmap.rw_lock);
+	up_write(&ni->file.run_lock);
+
+	if (err)
+		goto out;
+
+	err = wnd_extend(wnd, new_mft_total);
+
+	if (err)
+		goto out;
+
+	ntfs_clear_mft_tail(sbi, sbi->mft.used, new_mft_total);
+
+	err = _ni_write_inode(&ni->vfs_inode, 0);
+out:
+	return err;
+}
+
+/*
+ * ntfs_look_free_mft
+ *
+ * looks for a free MFT record
+ */
+int ntfs_look_free_mft(struct ntfs_sb_info *sbi, CLST *rno, bool mft,
+		       struct ntfs_inode *ni, struct mft_inode **mi)
+{
+	int err = 0;
+	size_t zbit, zlen, from, to, fr;
+	size_t mft_total;
+	struct MFT_REF ref;
+	struct super_block *sb = sbi->sb;
+	struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+	u32 ir;
+
+	static_assert(sizeof(sbi->mft.reserved_bitmap) * 8 >=
+		      MFT_REC_FREE - MFT_REC_RESERVED);
+
+	if (!mft)
+		down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_MFT);
+
+	zlen = wnd_zone_len(wnd);
+
+	/* Always reserve space for MFT */
+	if (zlen) {
+		if (mft) {
+			zbit = wnd_zone_bit(wnd);
+			*rno = zbit;
+			wnd_zone_set(wnd, zbit + 1, zlen - 1);
+		}
+		goto found;
+	}
+
+	/* No MFT zone. find the nearest to '0' free MFT */
+	if (!wnd_find(wnd, 1, MFT_REC_FREE, 0, &zbit)) {
+		/* Resize MFT */
+		mft_total = wnd->nbits;
+
+		err = ntfs_extend_mft(sbi);
+		if (!err) {
+			zbit = mft_total;
+			goto reserve_mft;
+		}
+
+		if (!mft || MFT_REC_FREE == sbi->mft.next_reserved)
+			goto out;
+
+		err = 0;
+
+		/*
+		 * Look for free record reserved area [11-16) ==
+		 * [MFT_REC_RESERVED, MFT_REC_FREE ) MFT bitmap always
+		 * marks it as used
+		 */
+		if (!sbi->mft.reserved_bitmap) {
+			/* Once per session create internal bitmap for 5 bits */
+			sbi->mft.reserved_bitmap = 0xFF;
+
+			ref.high = 0;
+			for (ir = MFT_REC_RESERVED; ir < MFT_REC_FREE; ir++) {
+				struct inode *i;
+				struct ntfs_inode *ni;
+				struct MFT_REC *mrec;
+
+				ref.low = cpu_to_le32(ir);
+				ref.seq = cpu_to_le16(ir);
+
+				i = ntfs_iget5(sb, &ref, NULL);
+				if (IS_ERR(i)) {
+next:
+					ntfs_notice(
+						sb,
+						"Invalid reserved record %x",
+						ref.low);
+					continue;
+				}
+				if (is_bad_inode(i)) {
+					iput(i);
+					goto next;
+				}
+
+				ni = ntfs_i(i);
+
+				mrec = ni->mi.mrec;
+
+				if (!is_rec_base(mrec))
+					goto next;
+
+				if (mrec->hard_links)
+					goto next;
+
+				if (!ni_std(ni))
+					goto next;
+
+				if (ni_find_attr(ni, NULL, NULL, ATTR_NAME,
+						 NULL, 0, NULL, NULL))
+					goto next;
+
+				__clear_bit(ir - MFT_REC_RESERVED,
+					    &sbi->mft.reserved_bitmap);
+			}
+		}
+
+		/* Scan 5 bits for zero. Bit 0 == MFT_REC_RESERVED */
+		zbit = find_next_zero_bit(&sbi->mft.reserved_bitmap,
+					  MFT_REC_FREE, MFT_REC_RESERVED);
+		if (zbit >= MFT_REC_FREE) {
+			sbi->mft.next_reserved = MFT_REC_FREE;
+			goto out;
+		}
+
+		zlen = 1;
+		sbi->mft.next_reserved = zbit;
+	} else {
+reserve_mft:
+		zlen = zbit == MFT_REC_FREE ? (MFT_REC_USER - MFT_REC_FREE) : 4;
+		if (zbit + zlen > wnd->nbits)
+			zlen = wnd->nbits - zbit;
+
+		while (zlen > 1 && !wnd_is_free(wnd, zbit, zlen))
+			zlen -= 1;
+
+		/* [zbit, zbit + zlen) will be used for Mft itself */
+		from = sbi->mft.used;
+		if (from < zbit)
+			from = zbit;
+		to = zbit + zlen;
+		if (from < to) {
+			ntfs_clear_mft_tail(sbi, from, to);
+			sbi->mft.used = to;
+		}
+	}
+
+	if (mft) {
+		*rno = zbit;
+		zbit += 1;
+		zlen -= 1;
+	}
+
+	wnd_zone_set(wnd, zbit, zlen);
+
+found:
+	if (!mft) {
+		/* The request to get record for general purpose */
+		if (sbi->mft.next_free < MFT_REC_USER)
+			sbi->mft.next_free = MFT_REC_USER;
+
+		for (;;) {
+			if (sbi->mft.next_free >= sbi->mft.bitmap.nbits) {
+			} else if (!wnd_find(wnd, 1, MFT_REC_USER, 0, &fr)) {
+				sbi->mft.next_free = sbi->mft.bitmap.nbits;
+			} else {
+				*rno = fr;
+				sbi->mft.next_free = *rno + 1;
+				break;
+			}
+
+			err = ntfs_extend_mft(sbi);
+			if (err)
+				goto out;
+		}
+	}
+
+	if (ni && !ni_add_subrecord(ni, *rno, mi)) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	/* We have found a record that are not reserved for next MFT */
+	if (*rno >= MFT_REC_FREE)
+		wnd_set_used(wnd, *rno, 1);
+	else if (*rno >= MFT_REC_RESERVED && sbi->mft.reserved_bitmap_inited)
+		__set_bit(*rno - MFT_REC_RESERVED, &sbi->mft.reserved_bitmap);
+
+out:
+	if (!mft)
+		up_write(&wnd->rw_lock);
+
+	return err;
+}
+
+/*
+ * ntfs_mark_rec_free
+ *
+ * marks record as free
+ */
+void ntfs_mark_rec_free(struct ntfs_sb_info *sbi, CLST rno)
+{
+	struct wnd_bitmap *wnd = &sbi->mft.bitmap;
+
+	down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_MFT);
+	if (rno >= wnd->nbits)
+		goto out;
+
+	if (rno >= MFT_REC_FREE) {
+		if (!wnd_is_used(wnd, rno, 1))
+			ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+		else
+			wnd_set_free(wnd, rno, 1);
+	} else if (rno >= MFT_REC_RESERVED && sbi->mft.reserved_bitmap_inited) {
+		__clear_bit(rno - MFT_REC_RESERVED, &sbi->mft.reserved_bitmap);
+	}
+
+	if (rno < wnd_zone_bit(wnd))
+		wnd_zone_set(wnd, rno, 1);
+	else if (rno < sbi->mft.next_free && rno >= MFT_REC_USER)
+		sbi->mft.next_free = rno;
+
+out:
+	up_write(&wnd->rw_lock);
+}
+
+/*
+ * ntfs_clear_mft_tail
+ *
+ * formats empty records [from, to)
+ * sbi->mft.bitmap is locked for write
+ */
+int ntfs_clear_mft_tail(struct ntfs_sb_info *sbi, size_t from, size_t to)
+{
+	int err;
+	u32 rs;
+	u64 vbo;
+	struct runs_tree *run;
+	struct ntfs_inode *ni;
+
+	if (from >= to)
+		return 0;
+
+	rs = sbi->record_size;
+	ni = sbi->mft.ni;
+	run = &ni->file.run;
+
+	down_read(&ni->file.run_lock);
+	vbo = (u64)from * rs;
+	for (; from < to; from++, vbo += rs) {
+		struct ntfs_buffers nb;
+
+		err = ntfs_get_bh(sbi, run, vbo, rs, &nb);
+		if (err)
+			goto out;
+
+		err = ntfs_write_bh(sbi, &sbi->new_rec->rhdr, &nb, 0);
+		nb_put(&nb);
+		if (err)
+			goto out;
+	}
+
+out:
+	sbi->mft.used = from;
+	up_read(&ni->file.run_lock);
+	return err;
+}
+
+/*
+ * ntfs_refresh_zone
+ *
+ * refreshes Mft zone
+ * sbi->used.bitmap is locked for rw
+ * sbi->mft.bitmap is locked for write
+ * sbi->mft.ni->file.run_lock for write
+ */
+int ntfs_refresh_zone(struct ntfs_sb_info *sbi)
+{
+	CLST zone_limit, zone_max, lcn, vcn, len;
+	size_t lcn_s, zlen;
+	struct wnd_bitmap *wnd = &sbi->used.bitmap;
+	struct ntfs_inode *ni = sbi->mft.ni;
+
+	/* Do not change anything unless we have non empty Mft zone */
+	if (wnd_zone_len(wnd))
+		return 0;
+
+	/*
+	 * Compute the mft zone at two steps
+	 * It would be nice if we are able to allocate
+	 * 1/8 of total clusters for MFT but not more then 512 MB
+	 */
+	zone_limit = (512 * 1024 * 1024) >> sbi->cluster_bits;
+	zone_max = wnd->nbits >> 3;
+	if (zone_max > zone_limit)
+		zone_max = zone_limit;
+
+	vcn = bytes_to_cluster(sbi,
+			       (u64)sbi->mft.bitmap.nbits << sbi->record_bits);
+
+	if (!run_lookup_entry(&ni->file.run, vcn - 1, &lcn, &len, NULL))
+		lcn = SPARSE_LCN;
+
+	/* We should always find Last Lcn for MFT */
+	if (lcn == SPARSE_LCN)
+		return -EINVAL;
+
+	lcn_s = lcn + 1;
+
+	/* Try to allocate clusters after last MFT run */
+	zlen = wnd_find(wnd, zone_max, lcn_s, 0, &lcn_s);
+	if (!zlen) {
+		ntfs_notice(sbi->sb, "MftZone: unavailable");
+		return 0;
+	}
+
+	/* Truncate too large zone */
+	wnd_zone_set(wnd, lcn_s, zlen);
+
+	return 0;
+}
+
+/*
+ * ntfs_update_mftmirr
+ *
+ * updates $MFTMirr data
+ */
+int ntfs_update_mftmirr(struct ntfs_sb_info *sbi, int wait)
+{
+	int err;
+	struct super_block *sb = sbi->sb;
+	u32 blocksize = sb->s_blocksize;
+	sector_t block1, block2;
+	u32 bytes;
+
+	if (!(sbi->flags & NTFS_FLAGS_MFTMIRR))
+		return 0;
+
+	err = 0;
+	bytes = sbi->mft.recs_mirr << sbi->record_bits;
+	block1 = sbi->mft.lbo >> sb->s_blocksize_bits;
+	block2 = sbi->mft.lbo2 >> sb->s_blocksize_bits;
+
+	for (; bytes >= blocksize; bytes -= blocksize) {
+		struct buffer_head *bh1, *bh2;
+
+		bh1 = sb_bread(sb, block1++);
+		if (!bh1) {
+			err = -EIO;
+			goto out;
+		}
+
+		bh2 = sb_getblk(sb, block2++);
+		if (!bh2) {
+			put_bh(bh1);
+			err = -EIO;
+			goto out;
+		}
+
+		if (buffer_locked(bh2))
+			__wait_on_buffer(bh2);
+
+		lock_buffer(bh2);
+		memcpy(bh2->b_data, bh1->b_data, blocksize);
+		set_buffer_uptodate(bh2);
+		mark_buffer_dirty(bh2);
+		unlock_buffer(bh2);
+
+		put_bh(bh1);
+		bh1 = NULL;
+
+		if (wait)
+			err = sync_dirty_buffer(bh2);
+
+		put_bh(bh2);
+		if (err)
+			goto out;
+	}
+
+	sbi->flags &= ~NTFS_FLAGS_MFTMIRR;
+
+out:
+	return err;
+}
+
+/*
+ * ntfs_set_state
+ *
+ * mount: ntfs_set_state(NTFS_DIRTY_DIRTY)
+ * umount: ntfs_set_state(NTFS_DIRTY_CLEAR)
+ * ntfs error: ntfs_set_state(NTFS_DIRTY_ERROR)
+ */
+int ntfs_set_state(struct ntfs_sb_info *sbi, enum NTFS_DIRTY_FLAGS dirty)
+{
+	int err;
+	struct ATTRIB *attr;
+	struct VOLUME_INFO *info;
+	struct mft_inode *mi;
+	struct ntfs_inode *ni;
+
+	/*
+	 * do not change state if fs was real_dirty
+	 * do not change state if fs already dirty(clear)
+	 * do not change any thing if mounted read only
+	 */
+	if (sbi->volume.real_dirty || sb_rdonly(sbi->sb))
+		return 0;
+
+	/* Check cached value */
+	if ((dirty == NTFS_DIRTY_CLEAR ? 0 : VOLUME_FLAG_DIRTY) ==
+	    (sbi->volume.flags & VOLUME_FLAG_DIRTY))
+		return 0;
+
+	ni = sbi->volume.ni;
+	if (!ni)
+		return -EINVAL;
+
+	mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_DIRTY);
+
+	attr = ni_find_attr(ni, NULL, NULL, ATTR_VOL_INFO, NULL, 0, NULL, &mi);
+	if (!attr) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	info = resident_data_ex(attr, SIZEOF_ATTRIBUTE_VOLUME_INFO);
+	if (!info) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	switch (dirty) {
+	case NTFS_DIRTY_ERROR:
+		ntfs_notice(sbi->sb, "Mark volume as dirty due to NTFS errors");
+		sbi->volume.real_dirty = true;
+		fallthrough;
+	case NTFS_DIRTY_DIRTY:
+		info->flags |= VOLUME_FLAG_DIRTY;
+		break;
+	case NTFS_DIRTY_CLEAR:
+		info->flags &= ~VOLUME_FLAG_DIRTY;
+		break;
+	}
+	/* cache current volume flags*/
+	sbi->volume.flags = info->flags;
+	mi->dirty = true;
+	err = 0;
+
+out:
+	ni_unlock(ni);
+	if (err)
+		return err;
+
+	mark_inode_dirty(&ni->vfs_inode);
+	/*verify(!ntfs_update_mftmirr()); */
+	err = sync_inode_metadata(&ni->vfs_inode, 1);
+
+	return err;
+}
+
+/*
+ * security_hash
+ *
+ * calculates a hash of security descriptor
+ */
+static inline __le32 security_hash(const void *sd, size_t bytes)
+{
+	u32 hash = 0;
+	const __le32 *ptr = sd;
+
+	bytes >>= 2;
+	while (bytes--)
+		hash = ((hash >> 0x1D) | (hash << 3)) + le32_to_cpu(*ptr++);
+	return cpu_to_le32(hash);
+}
+
+int ntfs_sb_read(struct super_block *sb, u64 lbo, size_t bytes, void *buffer)
+{
+	struct block_device *bdev = sb->s_bdev;
+	u32 blocksize = sb->s_blocksize;
+	u64 block = lbo >> sb->s_blocksize_bits;
+	u32 off = lbo & (blocksize - 1);
+	u32 op = blocksize - off;
+
+	for (; bytes; block += 1, off = 0, op = blocksize) {
+		struct buffer_head *bh = __bread(bdev, block, blocksize);
+
+		if (!bh)
+			return -EIO;
+
+		if (op > bytes)
+			op = bytes;
+
+		memcpy(buffer, bh->b_data + off, op);
+
+		put_bh(bh);
+
+		bytes -= op;
+		buffer = Add2Ptr(buffer, op);
+	}
+
+	return 0;
+}
+
+int ntfs_sb_write(struct super_block *sb, u64 lbo, size_t bytes,
+		  const void *buf, int wait)
+{
+	u32 blocksize = sb->s_blocksize;
+	struct block_device *bdev = sb->s_bdev;
+	sector_t block = lbo >> sb->s_blocksize_bits;
+	u32 off = lbo & (blocksize - 1);
+	u32 op = blocksize - off;
+	struct buffer_head *bh;
+
+	if (!wait && (sb->s_flags & SB_SYNCHRONOUS))
+		wait = 1;
+
+	for (; bytes; block += 1, off = 0, op = blocksize) {
+		if (op > bytes)
+			op = bytes;
+
+		if (op < blocksize) {
+			bh = __bread(bdev, block, blocksize);
+			if (!bh) {
+				ntfs_err(sb, "failed to read block %llx",
+					 (u64)block);
+				return -EIO;
+			}
+		} else {
+			bh = __getblk(bdev, block, blocksize);
+			if (!bh)
+				return -ENOMEM;
+		}
+
+		if (buffer_locked(bh))
+			__wait_on_buffer(bh);
+
+		lock_buffer(bh);
+		if (buf) {
+			memcpy(bh->b_data + off, buf, op);
+			buf = Add2Ptr(buf, op);
+		} else {
+			memset(bh->b_data + off, -1, op);
+		}
+
+		set_buffer_uptodate(bh);
+		mark_buffer_dirty(bh);
+		unlock_buffer(bh);
+
+		if (wait) {
+			int err = sync_dirty_buffer(bh);
+
+			if (err) {
+				ntfs_err(
+					sb,
+					"failed to sync buffer at block %llx, error %d",
+					(u64)block, err);
+				put_bh(bh);
+				return err;
+			}
+		}
+
+		put_bh(bh);
+
+		bytes -= op;
+	}
+	return 0;
+}
+
+int ntfs_sb_write_run(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+		      u64 vbo, const void *buf, size_t bytes)
+{
+	struct super_block *sb = sbi->sb;
+	u8 cluster_bits = sbi->cluster_bits;
+	u32 off = vbo & sbi->cluster_mask;
+	CLST lcn, clen, vcn = vbo >> cluster_bits, vcn_next;
+	u64 lbo, len;
+	size_t idx;
+
+	if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+		return -ENOENT;
+
+	if (lcn == SPARSE_LCN)
+		return -EINVAL;
+
+	lbo = ((u64)lcn << cluster_bits) + off;
+	len = ((u64)clen << cluster_bits) - off;
+
+	for (;;) {
+		u32 op = len < bytes ? len : bytes;
+		int err = ntfs_sb_write(sb, lbo, op, buf, 0);
+
+		if (err)
+			return err;
+
+		bytes -= op;
+		if (!bytes)
+			break;
+
+		vcn_next = vcn + clen;
+		if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+		    vcn != vcn_next)
+			return -ENOENT;
+
+		if (lcn == SPARSE_LCN)
+			return -EINVAL;
+
+		if (buf)
+			buf = Add2Ptr(buf, op);
+
+		lbo = ((u64)lcn << cluster_bits);
+		len = ((u64)clen << cluster_bits);
+	}
+
+	return 0;
+}
+
+struct buffer_head *ntfs_bread_run(struct ntfs_sb_info *sbi,
+				   const struct runs_tree *run, u64 vbo)
+{
+	struct super_block *sb = sbi->sb;
+	u8 cluster_bits = sbi->cluster_bits;
+	CLST lcn;
+	u64 lbo;
+
+	if (!run_lookup_entry(run, vbo >> cluster_bits, &lcn, NULL, NULL))
+		return ERR_PTR(-ENOENT);
+
+	lbo = ((u64)lcn << cluster_bits) + (vbo & sbi->cluster_mask);
+
+	return ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+}
+
+int ntfs_read_run_nb(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+		     u64 vbo, void *buf, u32 bytes, struct ntfs_buffers *nb)
+{
+	int err;
+	struct super_block *sb = sbi->sb;
+	u32 blocksize = sb->s_blocksize;
+	u8 cluster_bits = sbi->cluster_bits;
+	u32 off = vbo & sbi->cluster_mask;
+	u32 nbh = 0;
+	CLST vcn_next, vcn = vbo >> cluster_bits;
+	CLST lcn, clen;
+	u64 lbo, len;
+	size_t idx;
+	struct buffer_head *bh;
+
+	if (!run) {
+		/* first reading of $Volume + $MFTMirr + $LogFile goes here*/
+		if (vbo > MFT_REC_VOL * sbi->record_size) {
+			err = -ENOENT;
+			goto out;
+		}
+
+		/* use absolute boot's 'MFTCluster' to read record */
+		lbo = vbo + sbi->mft.lbo;
+		len = sbi->record_size;
+	} else if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+		err = -ENOENT;
+		goto out;
+	} else {
+		if (lcn == SPARSE_LCN) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		lbo = ((u64)lcn << cluster_bits) + off;
+		len = ((u64)clen << cluster_bits) - off;
+	}
+
+	off = lbo & (blocksize - 1);
+	if (nb) {
+		nb->off = off;
+		nb->bytes = bytes;
+	}
+
+	for (;;) {
+		u32 len32 = len >= bytes ? bytes : len;
+		sector_t block = lbo >> sb->s_blocksize_bits;
+
+		do {
+			u32 op = blocksize - off;
+
+			if (op > len32)
+				op = len32;
+
+			bh = ntfs_bread(sb, block);
+			if (!bh) {
+				err = -EIO;
+				goto out;
+			}
+
+			if (buf) {
+				memcpy(buf, bh->b_data + off, op);
+				buf = Add2Ptr(buf, op);
+			}
+
+			if (!nb) {
+				put_bh(bh);
+			} else if (nbh >= ARRAY_SIZE(nb->bh)) {
+				err = -EINVAL;
+				goto out;
+			} else {
+				nb->bh[nbh++] = bh;
+				nb->nbufs = nbh;
+			}
+
+			bytes -= op;
+			if (!bytes)
+				return 0;
+			len32 -= op;
+			block += 1;
+			off = 0;
+
+		} while (len32);
+
+		vcn_next = vcn + clen;
+		if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+		    vcn != vcn_next) {
+			err = -ENOENT;
+			goto out;
+		}
+
+		if (lcn == SPARSE_LCN) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		lbo = ((u64)lcn << cluster_bits);
+		len = ((u64)clen << cluster_bits);
+	}
+
+out:
+	if (!nbh)
+		return err;
+
+	while (nbh) {
+		put_bh(nb->bh[--nbh]);
+		nb->bh[nbh] = NULL;
+	}
+
+	nb->nbufs = 0;
+	return err;
+}
+
+/* Returns < 0 if error, 0 if ok, '-E_NTFS_FIXUP' if need to update fixups */
+int ntfs_read_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+		 struct NTFS_RECORD_HEADER *rhdr, u32 bytes,
+		 struct ntfs_buffers *nb)
+{
+	int err = ntfs_read_run_nb(sbi, run, vbo, rhdr, bytes, nb);
+
+	if (err)
+		return err;
+	return ntfs_fix_post_read(rhdr, nb->bytes, true);
+}
+
+int ntfs_get_bh(struct ntfs_sb_info *sbi, const struct runs_tree *run, u64 vbo,
+		u32 bytes, struct ntfs_buffers *nb)
+{
+	int err = 0;
+	struct super_block *sb = sbi->sb;
+	u32 blocksize = sb->s_blocksize;
+	u8 cluster_bits = sbi->cluster_bits;
+	CLST vcn_next, vcn = vbo >> cluster_bits;
+	u32 off;
+	u32 nbh = 0;
+	CLST lcn, clen;
+	u64 lbo, len;
+	size_t idx;
+
+	nb->bytes = bytes;
+
+	if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+		err = -ENOENT;
+		goto out;
+	}
+
+	off = vbo & sbi->cluster_mask;
+	lbo = ((u64)lcn << cluster_bits) + off;
+	len = ((u64)clen << cluster_bits) - off;
+
+	nb->off = off = lbo & (blocksize - 1);
+
+	for (;;) {
+		u32 len32 = len < bytes ? len : bytes;
+		sector_t block = lbo >> sb->s_blocksize_bits;
+
+		do {
+			u32 op;
+			struct buffer_head *bh;
+
+			if (nbh >= ARRAY_SIZE(nb->bh)) {
+				err = -EINVAL;
+				goto out;
+			}
+
+			op = blocksize - off;
+			if (op > len32)
+				op = len32;
+
+			if (op == blocksize) {
+				bh = sb_getblk(sb, block);
+				if (!bh) {
+					err = -ENOMEM;
+					goto out;
+				}
+				if (buffer_locked(bh))
+					__wait_on_buffer(bh);
+				set_buffer_uptodate(bh);
+			} else {
+				bh = ntfs_bread(sb, block);
+				if (!bh) {
+					err = -EIO;
+					goto out;
+				}
+			}
+
+			nb->bh[nbh++] = bh;
+			bytes -= op;
+			if (!bytes) {
+				nb->nbufs = nbh;
+				return 0;
+			}
+
+			block += 1;
+			len32 -= op;
+			off = 0;
+		} while (len32);
+
+		vcn_next = vcn + clen;
+		if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+		    vcn != vcn_next) {
+			err = -ENOENT;
+			goto out;
+		}
+
+		lbo = ((u64)lcn << cluster_bits);
+		len = ((u64)clen << cluster_bits);
+	}
+
+out:
+	while (nbh) {
+		put_bh(nb->bh[--nbh]);
+		nb->bh[nbh] = NULL;
+	}
+
+	nb->nbufs = 0;
+
+	return err;
+}
+
+int ntfs_write_bh(struct ntfs_sb_info *sbi, struct NTFS_RECORD_HEADER *rhdr,
+		  struct ntfs_buffers *nb, int sync)
+{
+	int err = 0;
+	struct super_block *sb = sbi->sb;
+	u32 block_size = sb->s_blocksize;
+	u32 bytes = nb->bytes;
+	u32 off = nb->off;
+	u16 fo = le16_to_cpu(rhdr->fix_off);
+	u16 fn = le16_to_cpu(rhdr->fix_num);
+	u32 idx;
+	__le16 *fixup;
+	__le16 sample;
+
+	if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+	    fn * SECTOR_SIZE > bytes) {
+		return -EINVAL;
+	}
+
+	for (idx = 0; bytes && idx < nb->nbufs; idx += 1, off = 0) {
+		u32 op = block_size - off;
+		char *bh_data;
+		struct buffer_head *bh = nb->bh[idx];
+		__le16 *ptr, *end_data;
+
+		if (op > bytes)
+			op = bytes;
+
+		if (buffer_locked(bh))
+			__wait_on_buffer(bh);
+
+		lock_buffer(nb->bh[idx]);
+
+		bh_data = bh->b_data + off;
+		end_data = Add2Ptr(bh_data, op);
+		memcpy(bh_data, rhdr, op);
+
+		if (!idx) {
+			u16 t16;
+
+			fixup = Add2Ptr(bh_data, fo);
+			sample = *fixup;
+			t16 = le16_to_cpu(sample);
+			if (t16 >= 0x7FFF) {
+				sample = *fixup = cpu_to_le16(1);
+			} else {
+				sample = cpu_to_le16(t16 + 1);
+				*fixup = sample;
+			}
+
+			*(__le16 *)Add2Ptr(rhdr, fo) = sample;
+		}
+
+		ptr = Add2Ptr(bh_data, SECTOR_SIZE - sizeof(short));
+
+		do {
+			*++fixup = *ptr;
+			*ptr = sample;
+			ptr += SECTOR_SIZE / sizeof(short);
+		} while (ptr < end_data);
+
+		set_buffer_uptodate(bh);
+		mark_buffer_dirty(bh);
+		unlock_buffer(bh);
+
+		if (sync) {
+			int err2 = sync_dirty_buffer(bh);
+
+			if (!err && err2)
+				err = err2;
+		}
+
+		bytes -= op;
+		rhdr = Add2Ptr(rhdr, op);
+	}
+
+	return err;
+}
+
+static inline struct bio *ntfs_alloc_bio(u32 nr_vecs)
+{
+	struct bio *bio = bio_alloc(GFP_NOFS | __GFP_HIGH, nr_vecs);
+
+	if (!bio && (current->flags & PF_MEMALLOC)) {
+		while (!bio && (nr_vecs /= 2))
+			bio = bio_alloc(GFP_NOFS | __GFP_HIGH, nr_vecs);
+	}
+	return bio;
+}
+
+/* read/write pages from/to disk*/
+int ntfs_bio_pages(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+		   struct page **pages, u32 nr_pages, u64 vbo, u32 bytes,
+		   u32 op)
+{
+	int err = 0;
+	struct bio *new, *bio = NULL;
+	struct super_block *sb = sbi->sb;
+	struct block_device *bdev = sb->s_bdev;
+	struct page *page;
+	u8 cluster_bits = sbi->cluster_bits;
+	CLST lcn, clen, vcn, vcn_next;
+	u32 add, off, page_idx;
+	u64 lbo, len;
+	size_t run_idx;
+	struct blk_plug plug;
+
+	if (!bytes)
+		return 0;
+
+	blk_start_plug(&plug);
+
+	/* align vbo and bytes to be 512 bytes aligned */
+	lbo = (vbo + bytes + 511) & ~511ull;
+	vbo = vbo & ~511ull;
+	bytes = lbo - vbo;
+
+	vcn = vbo >> cluster_bits;
+	if (!run_lookup_entry(run, vcn, &lcn, &clen, &run_idx)) {
+		err = -ENOENT;
+		goto out;
+	}
+	off = vbo & sbi->cluster_mask;
+	page_idx = 0;
+	page = pages[0];
+
+	for (;;) {
+		lbo = ((u64)lcn << cluster_bits) + off;
+		len = ((u64)clen << cluster_bits) - off;
+new_bio:
+		new = ntfs_alloc_bio(nr_pages - page_idx);
+		if (!new) {
+			err = -ENOMEM;
+			goto out;
+		}
+		if (bio) {
+			bio_chain(bio, new);
+			submit_bio(bio);
+		}
+		bio = new;
+		bio_set_dev(bio, bdev);
+		bio->bi_iter.bi_sector = lbo >> 9;
+		bio->bi_opf = op;
+
+		while (len) {
+			off = vbo & (PAGE_SIZE - 1);
+			add = off + len > PAGE_SIZE ? (PAGE_SIZE - off) : len;
+
+			if (bio_add_page(bio, page, add, off) < add)
+				goto new_bio;
+
+			if (bytes <= add)
+				goto out;
+			bytes -= add;
+			vbo += add;
+
+			if (add + off == PAGE_SIZE) {
+				page_idx += 1;
+				if (WARN_ON(page_idx >= nr_pages)) {
+					err = -EINVAL;
+					goto out;
+				}
+				page = pages[page_idx];
+			}
+
+			if (len <= add)
+				break;
+			len -= add;
+			lbo += add;
+		}
+
+		vcn_next = vcn + clen;
+		if (!run_get_entry(run, ++run_idx, &vcn, &lcn, &clen) ||
+		    vcn != vcn_next) {
+			err = -ENOENT;
+			goto out;
+		}
+		off = 0;
+	}
+out:
+	if (bio) {
+		if (!err)
+			err = submit_bio_wait(bio);
+		bio_put(bio);
+	}
+	blk_finish_plug(&plug);
+
+	return err;
+}
+
+/*
+ * Helper for ntfs_loadlog_and_replay
+ * fill on-disk logfile range by (-1)
+ * this means empty logfile
+ */
+int ntfs_bio_fill_1(struct ntfs_sb_info *sbi, const struct runs_tree *run)
+{
+	int err = 0;
+	struct super_block *sb = sbi->sb;
+	struct block_device *bdev = sb->s_bdev;
+	u8 cluster_bits = sbi->cluster_bits;
+	struct bio *new, *bio = NULL;
+	CLST lcn, clen;
+	u64 lbo, len;
+	size_t run_idx;
+	struct page *fill;
+	void *kaddr;
+	struct blk_plug plug;
+
+	fill = alloc_page(GFP_KERNEL);
+	if (!fill)
+		return -ENOMEM;
+
+	kaddr = kmap_atomic(fill);
+	memset(kaddr, -1, PAGE_SIZE);
+	kunmap_atomic(kaddr);
+	flush_dcache_page(fill);
+	lock_page(fill);
+
+	if (!run_lookup_entry(run, 0, &lcn, &clen, &run_idx)) {
+		err = -ENOENT;
+		goto out;
+	}
+
+	/*
+	 * TODO: try blkdev_issue_write_same
+	 */
+	blk_start_plug(&plug);
+	do {
+		lbo = (u64)lcn << cluster_bits;
+		len = (u64)clen << cluster_bits;
+new_bio:
+		new = ntfs_alloc_bio(BIO_MAX_PAGES);
+		if (!new) {
+			err = -ENOMEM;
+			break;
+		}
+		if (bio) {
+			bio_chain(bio, new);
+			submit_bio(bio);
+		}
+		bio = new;
+		bio_set_dev(bio, bdev);
+		bio->bi_opf = REQ_OP_WRITE;
+		bio->bi_iter.bi_sector = lbo >> 9;
+
+		for (;;) {
+			u32 add = len > PAGE_SIZE ? PAGE_SIZE : len;
+
+			if (bio_add_page(bio, fill, add, 0) < add)
+				goto new_bio;
+
+			lbo += add;
+			if (len <= add)
+				break;
+			len -= add;
+		}
+	} while (run_get_entry(run, ++run_idx, NULL, &lcn, &clen));
+
+	if (bio) {
+		if (!err)
+			err = submit_bio_wait(bio);
+		bio_put(bio);
+	}
+	blk_finish_plug(&plug);
+out:
+	unlock_page(fill);
+	put_page(fill);
+
+	return err;
+}
+
+int ntfs_vbo_to_lbo(struct ntfs_sb_info *sbi, const struct runs_tree *run,
+		    u64 vbo, u64 *lbo, u64 *bytes)
+{
+	u32 off;
+	CLST lcn, len;
+	u8 cluster_bits = sbi->cluster_bits;
+
+	if (!run_lookup_entry(run, vbo >> cluster_bits, &lcn, &len, NULL))
+		return -ENOENT;
+
+	off = vbo & sbi->cluster_mask;
+	*lbo = lcn == SPARSE_LCN ? -1 : (((u64)lcn << cluster_bits) + off);
+	*bytes = ((u64)len << cluster_bits) - off;
+
+	return 0;
+}
+
+struct ntfs_inode *ntfs_new_inode(struct ntfs_sb_info *sbi, CLST rno, bool dir)
+{
+	int err = 0;
+	struct super_block *sb = sbi->sb;
+	struct inode *inode = new_inode(sb);
+	struct ntfs_inode *ni;
+
+	if (!inode)
+		return ERR_PTR(-ENOMEM);
+
+	ni = ntfs_i(inode);
+
+	err = mi_format_new(&ni->mi, sbi, rno, dir ? RECORD_FLAG_DIR : 0,
+			    false);
+	if (err)
+		goto out;
+
+	inode->i_ino = rno;
+	if (insert_inode_locked(inode) < 0) {
+		err = -EIO;
+		goto out;
+	}
+
+out:
+	if (err) {
+		iput(inode);
+		ni = ERR_PTR(err);
+	}
+	return ni;
+}
+
+/*
+ * O:BAG:BAD:(A;OICI;FA;;;WD)
+ * owner S-1-5-32-544 (Administrators)
+ * group S-1-5-32-544 (Administrators)
+ * ACE: allow S-1-1-0 (Everyone) with FILE_ALL_ACCESS
+ */
+const u8 s_default_security[] __aligned(8) = {
+	0x01, 0x00, 0x04, 0x80, 0x30, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x02, 0x00, 0x1C, 0x00,
+	0x01, 0x00, 0x00, 0x00, 0x00, 0x03, 0x14, 0x00, 0xFF, 0x01, 0x1F, 0x00,
+	0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00,
+	0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x20, 0x00, 0x00, 0x00,
+	0x20, 0x02, 0x00, 0x00, 0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05,
+	0x20, 0x00, 0x00, 0x00, 0x20, 0x02, 0x00, 0x00,
+};
+
+static_assert(sizeof(s_default_security) == 0x50);
+
+static inline u32 sid_length(const struct SID *sid)
+{
+	return offsetof(struct SID, SubAuthority[0]) +
+	       (sid->SubAuthorityCount * sizeof(u32));
+}
+
+/*
+ * Thanks Mark Harmstone for idea
+ */
+static bool is_acl_valid(const struct ACL *acl, u32 len)
+{
+	const struct ACE_HEADER *ace;
+	u32 i;
+	u16 ace_count, ace_size;
+
+	if (acl->AclRevision != ACL_REVISION)
+		return false;
+
+	if (acl->Sbz1)
+		return false;
+
+	if (le16_to_cpu(acl->AclSize) > len)
+		return false;
+
+	if (acl->Sbz2)
+		return false;
+
+	len -= sizeof(struct ACL);
+
+	ace = (struct ACE_HEADER *)&acl[1];
+	ace_count = le16_to_cpu(acl->AceCount);
+
+	for (i = 0; i < ace_count; i++) {
+		if (len < sizeof(struct ACE_HEADER))
+			return false;
+
+		ace_size = le16_to_cpu(ace->AceSize);
+		if (len < ace_size)
+			return false;
+
+		len -= ace_size;
+
+		ace = Add2Ptr(ace, ace_size);
+	}
+
+	return true;
+}
+
+bool is_sd_valid(const struct SECURITY_DESCRIPTOR_RELATIVE *sd, u32 len)
+{
+	u32 sd_owner, sd_group, sd_sacl, sd_dacl;
+
+	if (len < sizeof(struct SECURITY_DESCRIPTOR_RELATIVE))
+		return false;
+
+	if (sd->Revision != 1)
+		return false;
+
+	if (sd->Sbz1)
+		return false;
+
+	if (!(sd->Control & SE_SELF_RELATIVE))
+		return false;
+
+	sd_owner = le32_to_cpu(sd->Owner);
+	if (sd_owner) {
+		const struct SID *owner = Add2Ptr(sd, sd_owner);
+
+		if (sd_owner + offsetof(struct SID, SubAuthority) > len)
+			return false;
+
+		if (owner->Revision != 1)
+			return false;
+
+		if (sd_owner + sid_length(owner) > len)
+			return false;
+	}
+
+	sd_group = le32_to_cpu(sd->Group);
+	if (sd_group) {
+		const struct SID *group = Add2Ptr(sd, sd_group);
+
+		if (sd_group + offsetof(struct SID, SubAuthority) > len)
+			return false;
+
+		if (group->Revision != 1)
+			return false;
+
+		if (sd_group + sid_length(group) > len)
+			return false;
+	}
+
+	sd_sacl = le32_to_cpu(sd->Sacl);
+	if (sd_sacl) {
+		const struct ACL *sacl = Add2Ptr(sd, sd_sacl);
+
+		if (sd_sacl + sizeof(struct ACL) > len)
+			return false;
+
+		if (!is_acl_valid(sacl, len - sd_sacl))
+			return false;
+	}
+
+	sd_dacl = le32_to_cpu(sd->Dacl);
+	if (sd_dacl) {
+		const struct ACL *dacl = Add2Ptr(sd, sd_dacl);
+
+		if (sd_dacl + sizeof(struct ACL) > len)
+			return false;
+
+		if (!is_acl_valid(dacl, len - sd_dacl))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * ntfs_security_init
+ *
+ * loads and parse $Secure
+ */
+int ntfs_security_init(struct ntfs_sb_info *sbi)
+{
+	int err;
+	struct super_block *sb = sbi->sb;
+	struct inode *inode;
+	struct ntfs_inode *ni;
+	struct MFT_REF ref;
+	struct ATTRIB *attr;
+	struct ATTR_LIST_ENTRY *le;
+	u64 sds_size;
+	size_t cnt, off;
+	struct NTFS_DE *ne;
+	struct NTFS_DE_SII *sii_e;
+	struct ntfs_fnd *fnd_sii = NULL;
+	const struct INDEX_ROOT *root_sii;
+	const struct INDEX_ROOT *root_sdh;
+	struct ntfs_index *indx_sdh = &sbi->security.index_sdh;
+	struct ntfs_index *indx_sii = &sbi->security.index_sii;
+
+	ref.low = cpu_to_le32(MFT_REC_SECURE);
+	ref.high = 0;
+	ref.seq = cpu_to_le16(MFT_REC_SECURE);
+
+	inode = ntfs_iget5(sb, &ref, &NAME_SECURE);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $Secure.");
+		inode = NULL;
+		goto out;
+	}
+
+	ni = ntfs_i(inode);
+
+	le = NULL;
+
+	attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SDH_NAME,
+			    ARRAY_SIZE(SDH_NAME), NULL, NULL);
+	if (!attr) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	root_sdh = resident_data(attr);
+	if (root_sdh->type != ATTR_ZERO ||
+	    root_sdh->rule != NTFS_COLLATION_TYPE_SECURITY_HASH) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	err = indx_init(indx_sdh, sbi, attr, INDEX_MUTEX_SDH);
+	if (err)
+		goto out;
+
+	attr = ni_find_attr(ni, attr, &le, ATTR_ROOT, SII_NAME,
+			    ARRAY_SIZE(SII_NAME), NULL, NULL);
+	if (!attr) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	root_sii = resident_data(attr);
+	if (root_sii->type != ATTR_ZERO ||
+	    root_sii->rule != NTFS_COLLATION_TYPE_UINT) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	err = indx_init(indx_sii, sbi, attr, INDEX_MUTEX_SII);
+	if (err)
+		goto out;
+
+	fnd_sii = fnd_get();
+	if (!fnd_sii) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	sds_size = inode->i_size;
+
+	/* Find the last valid Id */
+	sbi->security.next_id = SECURITY_ID_FIRST;
+	/* Always write new security at the end of bucket */
+	sbi->security.next_off =
+		Quad2Align(sds_size - SecurityDescriptorsBlockSize);
+
+	cnt = 0;
+	off = 0;
+	ne = NULL;
+
+	for (;;) {
+		u32 next_id;
+
+		err = indx_find_raw(indx_sii, ni, root_sii, &ne, &off, fnd_sii);
+		if (err || !ne)
+			break;
+
+		sii_e = (struct NTFS_DE_SII *)ne;
+		if (le16_to_cpu(ne->view.data_size) < SIZEOF_SECURITY_HDR)
+			continue;
+
+		next_id = le32_to_cpu(sii_e->sec_id) + 1;
+		if (next_id >= sbi->security.next_id)
+			sbi->security.next_id = next_id;
+
+		cnt += 1;
+	}
+
+	sbi->security.ni = ni;
+	inode = NULL;
+out:
+	iput(inode);
+	fnd_put(fnd_sii);
+
+	return err;
+}
+
+/*
+ * ntfs_get_security_by_id
+ *
+ * reads security descriptor by id
+ */
+int ntfs_get_security_by_id(struct ntfs_sb_info *sbi, __le32 security_id,
+			    struct SECURITY_DESCRIPTOR_RELATIVE **sd,
+			    size_t *size)
+{
+	int err;
+	int diff;
+	struct ntfs_inode *ni = sbi->security.ni;
+	struct ntfs_index *indx = &sbi->security.index_sii;
+	void *p = NULL;
+	struct NTFS_DE_SII *sii_e;
+	struct ntfs_fnd *fnd_sii;
+	struct SECURITY_HDR d_security;
+	const struct INDEX_ROOT *root_sii;
+	u32 t32;
+
+	*sd = NULL;
+
+	mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_SECURITY);
+
+	fnd_sii = fnd_get();
+	if (!fnd_sii) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	root_sii = indx_get_root(indx, ni, NULL, NULL);
+	if (!root_sii) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Try to find this SECURITY descriptor in SII indexes */
+	err = indx_find(indx, ni, root_sii, &security_id, sizeof(security_id),
+			NULL, &diff, (struct NTFS_DE **)&sii_e, fnd_sii);
+	if (err)
+		goto out;
+
+	if (diff)
+		goto out;
+
+	t32 = le32_to_cpu(sii_e->sec_hdr.size);
+	if (t32 < SIZEOF_SECURITY_HDR) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (t32 > SIZEOF_SECURITY_HDR + 0x10000) {
+		/*
+		 * looks like too big security. 0x10000 - is arbitrary big number
+		 */
+		err = -EFBIG;
+		goto out;
+	}
+
+	*size = t32 - SIZEOF_SECURITY_HDR;
+
+	p = ntfs_malloc(*size);
+	if (!p) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	err = ntfs_read_run_nb(sbi, &ni->file.run,
+			       le64_to_cpu(sii_e->sec_hdr.off), &d_security,
+			       sizeof(d_security), NULL);
+	if (err)
+		goto out;
+
+	if (memcmp(&d_security, &sii_e->sec_hdr, SIZEOF_SECURITY_HDR)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	err = ntfs_read_run_nb(sbi, &ni->file.run,
+			       le64_to_cpu(sii_e->sec_hdr.off) +
+				       SIZEOF_SECURITY_HDR,
+			       p, *size, NULL);
+	if (err)
+		goto out;
+
+	*sd = p;
+	p = NULL;
+
+out:
+	ntfs_free(p);
+	fnd_put(fnd_sii);
+	ni_unlock(ni);
+
+	return err;
+}
+
+/*
+ * ntfs_insert_security
+ *
+ * inserts security descriptor into $Secure::SDS
+ *
+ * SECURITY Descriptor Stream data is organized into chunks of 256K bytes
+ * and it contains a mirror copy of each security descriptor.  When writing
+ * to a security descriptor at location X, another copy will be written at
+ * location (X+256K).
+ * When writing a security descriptor that will cross the 256K boundary,
+ * the pointer will be advanced by 256K to skip
+ * over the mirror portion.
+ */
+int ntfs_insert_security(struct ntfs_sb_info *sbi,
+			 const struct SECURITY_DESCRIPTOR_RELATIVE *sd,
+			 u32 size_sd, __le32 *security_id, bool *inserted)
+{
+	int err, diff;
+	struct ntfs_inode *ni = sbi->security.ni;
+	struct ntfs_index *indx_sdh = &sbi->security.index_sdh;
+	struct ntfs_index *indx_sii = &sbi->security.index_sii;
+	struct NTFS_DE_SDH *e;
+	struct NTFS_DE_SDH sdh_e;
+	struct NTFS_DE_SII sii_e;
+	struct SECURITY_HDR *d_security;
+	u32 new_sec_size = size_sd + SIZEOF_SECURITY_HDR;
+	u32 aligned_sec_size = Quad2Align(new_sec_size);
+	struct SECURITY_KEY hash_key;
+	struct ntfs_fnd *fnd_sdh = NULL;
+	const struct INDEX_ROOT *root_sdh;
+	const struct INDEX_ROOT *root_sii;
+	u64 mirr_off, new_sds_size;
+	u32 next, left;
+
+	static_assert((1 << Log2OfSecurityDescriptorsBlockSize) ==
+		      SecurityDescriptorsBlockSize);
+
+	hash_key.hash = security_hash(sd, size_sd);
+	hash_key.sec_id = SECURITY_ID_INVALID;
+
+	if (inserted)
+		*inserted = false;
+	*security_id = SECURITY_ID_INVALID;
+
+	/* Allocate a temporal buffer*/
+	d_security = ntfs_zalloc(aligned_sec_size);
+	if (!d_security)
+		return -ENOMEM;
+
+	mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_SECURITY);
+
+	fnd_sdh = fnd_get();
+	if (!fnd_sdh) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	root_sdh = indx_get_root(indx_sdh, ni, NULL, NULL);
+	if (!root_sdh) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	root_sii = indx_get_root(indx_sii, ni, NULL, NULL);
+	if (!root_sii) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/*
+	 * Check if such security already exists
+	 * use "SDH" and hash -> to get the offset in "SDS"
+	 */
+	err = indx_find(indx_sdh, ni, root_sdh, &hash_key, sizeof(hash_key),
+			&d_security->key.sec_id, &diff, (struct NTFS_DE **)&e,
+			fnd_sdh);
+	if (err)
+		goto out;
+
+	while (e) {
+		if (le32_to_cpu(e->sec_hdr.size) == new_sec_size) {
+			err = ntfs_read_run_nb(sbi, &ni->file.run,
+					       le64_to_cpu(e->sec_hdr.off),
+					       d_security, new_sec_size, NULL);
+			if (err)
+				goto out;
+
+			if (le32_to_cpu(d_security->size) == new_sec_size &&
+			    d_security->key.hash == hash_key.hash &&
+			    !memcmp(d_security + 1, sd, size_sd)) {
+				*security_id = d_security->key.sec_id;
+				/*such security already exists*/
+				err = 0;
+				goto out;
+			}
+		}
+
+		err = indx_find_sort(indx_sdh, ni, root_sdh,
+				     (struct NTFS_DE **)&e, fnd_sdh);
+		if (err)
+			goto out;
+
+		if (!e || e->key.hash != hash_key.hash)
+			break;
+	}
+
+	/* Zero unused space */
+	next = sbi->security.next_off & (SecurityDescriptorsBlockSize - 1);
+	left = SecurityDescriptorsBlockSize - next;
+
+	/* Zero gap until SecurityDescriptorsBlockSize */
+	if (left < new_sec_size) {
+		/* zero "left" bytes from sbi->security.next_off */
+		sbi->security.next_off += SecurityDescriptorsBlockSize + left;
+	}
+
+	/* Zero tail of previous security */
+	//used = ni->vfs_inode.i_size & (SecurityDescriptorsBlockSize - 1);
+
+	/*
+	 * Example:
+	 * 0x40438 == ni->vfs_inode.i_size
+	 * 0x00440 == sbi->security.next_off
+	 * need to zero [0x438-0x440)
+	 * if (next > used) {
+	 *  u32 tozero = next - used;
+	 *  zero "tozero" bytes from sbi->security.next_off - tozero
+	 */
+
+	/* format new security descriptor */
+	d_security->key.hash = hash_key.hash;
+	d_security->key.sec_id = cpu_to_le32(sbi->security.next_id);
+	d_security->off = cpu_to_le64(sbi->security.next_off);
+	d_security->size = cpu_to_le32(new_sec_size);
+	memcpy(d_security + 1, sd, size_sd);
+
+	/* Write main SDS bucket */
+	err = ntfs_sb_write_run(sbi, &ni->file.run, sbi->security.next_off,
+				d_security, aligned_sec_size);
+
+	if (err)
+		goto out;
+
+	mirr_off = sbi->security.next_off + SecurityDescriptorsBlockSize;
+	new_sds_size = mirr_off + aligned_sec_size;
+
+	if (new_sds_size > ni->vfs_inode.i_size) {
+		err = attr_set_size(ni, ATTR_DATA, SDS_NAME,
+				    ARRAY_SIZE(SDS_NAME), &ni->file.run,
+				    new_sds_size, &new_sds_size, false, NULL);
+		if (err)
+			goto out;
+	}
+
+	/* Write copy SDS bucket */
+	err = ntfs_sb_write_run(sbi, &ni->file.run, mirr_off, d_security,
+				aligned_sec_size);
+	if (err)
+		goto out;
+
+	/* Fill SII entry */
+	sii_e.de.view.data_off =
+		cpu_to_le16(offsetof(struct NTFS_DE_SII, sec_hdr));
+	sii_e.de.view.data_size = cpu_to_le16(SIZEOF_SECURITY_HDR);
+	sii_e.de.view.res = 0;
+	sii_e.de.size = cpu_to_le16(SIZEOF_SII_DIRENTRY);
+	sii_e.de.key_size = cpu_to_le16(sizeof(d_security->key.sec_id));
+	sii_e.de.flags = 0;
+	sii_e.de.res = 0;
+	sii_e.sec_id = d_security->key.sec_id;
+	memcpy(&sii_e.sec_hdr, d_security, SIZEOF_SECURITY_HDR);
+
+	err = indx_insert_entry(indx_sii, ni, &sii_e.de, NULL, NULL);
+	if (err)
+		goto out;
+
+	/* Fill SDH entry */
+	sdh_e.de.view.data_off =
+		cpu_to_le16(offsetof(struct NTFS_DE_SDH, sec_hdr));
+	sdh_e.de.view.data_size = cpu_to_le16(SIZEOF_SECURITY_HDR);
+	sdh_e.de.view.res = 0;
+	sdh_e.de.size = cpu_to_le16(SIZEOF_SDH_DIRENTRY);
+	sdh_e.de.key_size = cpu_to_le16(sizeof(sdh_e.key));
+	sdh_e.de.flags = 0;
+	sdh_e.de.res = 0;
+	sdh_e.key.hash = d_security->key.hash;
+	sdh_e.key.sec_id = d_security->key.sec_id;
+	memcpy(&sdh_e.sec_hdr, d_security, SIZEOF_SECURITY_HDR);
+	sdh_e.magic[0] = cpu_to_le16('I');
+	sdh_e.magic[1] = cpu_to_le16('I');
+
+	fnd_clear(fnd_sdh);
+	err = indx_insert_entry(indx_sdh, ni, &sdh_e.de, (void *)(size_t)1,
+				fnd_sdh);
+	if (err)
+		goto out;
+
+	*security_id = d_security->key.sec_id;
+	if (inserted)
+		*inserted = true;
+
+	/* Update Id and offset for next descriptor */
+	sbi->security.next_id += 1;
+	sbi->security.next_off += aligned_sec_size;
+
+out:
+	fnd_put(fnd_sdh);
+	mark_inode_dirty(&ni->vfs_inode);
+	ni_unlock(ni);
+	ntfs_free(d_security);
+
+	return err;
+}
+
+/*
+ * ntfs_reparse_init
+ *
+ * loads and parse $Extend/$Reparse
+ */
+int ntfs_reparse_init(struct ntfs_sb_info *sbi)
+{
+	int err;
+	struct ntfs_inode *ni = sbi->reparse.ni;
+	struct ntfs_index *indx = &sbi->reparse.index_r;
+	struct ATTRIB *attr;
+	struct ATTR_LIST_ENTRY *le;
+	const struct INDEX_ROOT *root_r;
+
+	if (!ni)
+		return 0;
+
+	le = NULL;
+	attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SR_NAME,
+			    ARRAY_SIZE(SR_NAME), NULL, NULL);
+	if (!attr) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	root_r = resident_data(attr);
+	if (root_r->type != ATTR_ZERO ||
+	    root_r->rule != NTFS_COLLATION_TYPE_UINTS) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	err = indx_init(indx, sbi, attr, INDEX_MUTEX_SR);
+	if (err)
+		goto out;
+
+out:
+	return err;
+}
+
+/*
+ * ntfs_objid_init
+ *
+ * loads and parse $Extend/$ObjId
+ */
+int ntfs_objid_init(struct ntfs_sb_info *sbi)
+{
+	int err;
+	struct ntfs_inode *ni = sbi->objid.ni;
+	struct ntfs_index *indx = &sbi->objid.index_o;
+	struct ATTRIB *attr;
+	struct ATTR_LIST_ENTRY *le;
+	const struct INDEX_ROOT *root;
+
+	if (!ni)
+		return 0;
+
+	le = NULL;
+	attr = ni_find_attr(ni, NULL, &le, ATTR_ROOT, SO_NAME,
+			    ARRAY_SIZE(SO_NAME), NULL, NULL);
+	if (!attr) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	root = resident_data(attr);
+	if (root->type != ATTR_ZERO ||
+	    root->rule != NTFS_COLLATION_TYPE_UINTS) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	err = indx_init(indx, sbi, attr, INDEX_MUTEX_SO);
+	if (err)
+		goto out;
+
+out:
+	return err;
+}
+
+int ntfs_objid_remove(struct ntfs_sb_info *sbi, struct GUID *guid)
+{
+	int err;
+	struct ntfs_inode *ni = sbi->objid.ni;
+	struct ntfs_index *indx = &sbi->objid.index_o;
+
+	if (!ni)
+		return -EINVAL;
+
+	mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_OBJID);
+
+	err = indx_delete_entry(indx, ni, guid, sizeof(*guid), NULL);
+
+	mark_inode_dirty(&ni->vfs_inode);
+	ni_unlock(ni);
+
+	return err;
+}
+
+int ntfs_insert_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+			const struct MFT_REF *ref)
+{
+	int err;
+	struct ntfs_inode *ni = sbi->reparse.ni;
+	struct ntfs_index *indx = &sbi->reparse.index_r;
+	struct NTFS_DE_R re;
+
+	if (!ni)
+		return -EINVAL;
+
+	memset(&re, 0, sizeof(re));
+
+	re.de.view.data_off = cpu_to_le16(offsetof(struct NTFS_DE_R, zero));
+	re.de.size = cpu_to_le16(sizeof(struct NTFS_DE_R));
+	re.de.key_size = cpu_to_le16(sizeof(re.key));
+
+	re.key.ReparseTag = rtag;
+	memcpy(&re.key.ref, ref, sizeof(*ref));
+
+	mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_REPARSE);
+
+	err = indx_insert_entry(indx, ni, &re.de, NULL, NULL);
+
+	mark_inode_dirty(&ni->vfs_inode);
+	ni_unlock(ni);
+
+	return err;
+}
+
+int ntfs_remove_reparse(struct ntfs_sb_info *sbi, __le32 rtag,
+			const struct MFT_REF *ref)
+{
+	int err, diff;
+	struct ntfs_inode *ni = sbi->reparse.ni;
+	struct ntfs_index *indx = &sbi->reparse.index_r;
+	struct ntfs_fnd *fnd = NULL;
+	struct REPARSE_KEY rkey;
+	struct NTFS_DE_R *re;
+	struct INDEX_ROOT *root_r;
+
+	if (!ni)
+		return -EINVAL;
+
+	rkey.ReparseTag = rtag;
+	rkey.ref = *ref;
+
+	mutex_lock_nested(&ni->ni_lock, NTFS_INODE_MUTEX_REPARSE);
+
+	if (rtag) {
+		err = indx_delete_entry(indx, ni, &rkey, sizeof(rkey), NULL);
+		goto out1;
+	}
+
+	fnd = fnd_get();
+	if (!fnd) {
+		err = -ENOMEM;
+		goto out1;
+	}
+
+	root_r = indx_get_root(indx, ni, NULL, NULL);
+	if (!root_r) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* 1 - forces to ignore rkey.ReparseTag when comparing keys */
+	err = indx_find(indx, ni, root_r, &rkey, sizeof(rkey), (void *)1, &diff,
+			(struct NTFS_DE **)&re, fnd);
+	if (err)
+		goto out;
+
+	if (memcmp(&re->key.ref, ref, sizeof(*ref))) {
+		/* Impossible. Looks like volume corrupt?*/
+		goto out;
+	}
+
+	memcpy(&rkey, &re->key, sizeof(rkey));
+
+	fnd_put(fnd);
+	fnd = NULL;
+
+	err = indx_delete_entry(indx, ni, &rkey, sizeof(rkey), NULL);
+	if (err)
+		goto out;
+
+out:
+	fnd_put(fnd);
+
+out1:
+	mark_inode_dirty(&ni->vfs_inode);
+	ni_unlock(ni);
+
+	return err;
+}
+
+static inline void ntfs_unmap_and_discard(struct ntfs_sb_info *sbi, CLST lcn,
+					  CLST len)
+{
+	ntfs_unmap_meta(sbi->sb, lcn, len);
+	ntfs_discard(sbi, lcn, len);
+}
+
+void mark_as_free_ex(struct ntfs_sb_info *sbi, CLST lcn, CLST len, bool trim)
+{
+	CLST end, i;
+	struct wnd_bitmap *wnd = &sbi->used.bitmap;
+
+	down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+	if (!wnd_is_used(wnd, lcn, len)) {
+		ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+
+		end = lcn + len;
+		len = 0;
+		for (i = lcn; i < end; i++) {
+			if (wnd_is_used(wnd, i, 1)) {
+				if (!len)
+					lcn = i;
+				len += 1;
+				continue;
+			}
+
+			if (!len)
+				continue;
+
+			if (trim)
+				ntfs_unmap_and_discard(sbi, lcn, len);
+
+			wnd_set_free(wnd, lcn, len);
+			len = 0;
+		}
+
+		if (!len)
+			goto out;
+	}
+
+	if (trim)
+		ntfs_unmap_and_discard(sbi, lcn, len);
+	wnd_set_free(wnd, lcn, len);
+
+out:
+	up_write(&wnd->rw_lock);
+}
+
+/*
+ * run_deallocate
+ *
+ * deallocate clusters
+ */
+int run_deallocate(struct ntfs_sb_info *sbi, struct runs_tree *run, bool trim)
+{
+	CLST lcn, len;
+	size_t idx = 0;
+
+	while (run_get_entry(run, idx++, NULL, &lcn, &len)) {
+		if (lcn == SPARSE_LCN)
+			continue;
+
+		mark_as_free_ex(sbi, lcn, len, trim);
+	}
+
+	return 0;
+}
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
new file mode 100644
index 000000000000..df2969248816
--- /dev/null
+++ b/fs/ntfs3/index.c
@@ -0,0 +1,2662 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+static const struct INDEX_NAMES {
+	const __le16 *name;
+	u8 name_len;
+} s_index_names[INDEX_MUTEX_TOTAL] = {
+	{ I30_NAME, ARRAY_SIZE(I30_NAME) }, { SII_NAME, ARRAY_SIZE(SII_NAME) },
+	{ SDH_NAME, ARRAY_SIZE(SDH_NAME) }, { SO_NAME, ARRAY_SIZE(SO_NAME) },
+	{ SQ_NAME, ARRAY_SIZE(SQ_NAME) },   { SR_NAME, ARRAY_SIZE(SR_NAME) },
+};
+
+/*
+ * compare two names in index
+ * if l1 != 0
+ *   both names are little endian on-disk ATTR_FILE_NAME structs
+ * else
+ *   f1 - cpu_str, f2 - ATTR_FILE_NAME
+ */
+static int cmp_fnames(const struct ATTR_FILE_NAME *f1, size_t l1,
+		      const struct ATTR_FILE_NAME *f2, size_t l2,
+		      const struct ntfs_sb_info *sbi)
+{
+	u16 fsize2;
+	bool both_case;
+
+	if (l2 <= offsetof(struct ATTR_FILE_NAME, name))
+		return -1;
+
+	fsize2 = fname_full_size(f2);
+	if (l2 < fsize2)
+		return -1;
+
+	both_case = f2->type != FILE_NAME_DOS /*&& !sbi->options.nocase*/;
+	if (!l1) {
+		const struct cpu_str *s1 = (struct cpu_str *)f1;
+		const struct le_str *s2 = (struct le_str *)&f2->name_len;
+
+		/*
+		 * If names are equal (case insensitive)
+		 * try to compare it case sensitive
+		 */
+		return ntfs_cmp_names_cpu(s1, s2, sbi->upcase, both_case);
+	}
+
+	return ntfs_cmp_names(f1->name, f1->name_len, f2->name, f2->name_len,
+			      sbi->upcase, both_case);
+}
+
+/* $SII of $Secure and $Q of Quota */
+static int cmp_uint(const u32 *k1, size_t l1, const u32 *k2, size_t l2,
+		    const void *p)
+{
+	if (l2 < sizeof(u32))
+		return -1;
+
+	if (*k1 < *k2)
+		return -1;
+	if (*k1 > *k2)
+		return 1;
+	return 0;
+}
+
+/* $SDH of $Secure */
+static int cmp_sdh(const struct SECURITY_KEY *k1, size_t l1,
+		   const struct SECURITY_KEY *k2, size_t l2, const void *p)
+{
+	u32 t1, t2;
+
+	if (l2 < sizeof(struct SECURITY_KEY))
+		return -1;
+
+	t1 = le32_to_cpu(k1->hash);
+	t2 = le32_to_cpu(k2->hash);
+
+	/* First value is a hash value itself */
+	if (t1 < t2)
+		return -1;
+	if (t1 > t2)
+		return 1;
+
+	/* Second value is security Id */
+	if (p) {
+		t1 = le32_to_cpu(k1->sec_id);
+		t2 = le32_to_cpu(k2->sec_id);
+		if (t1 < t2)
+			return -1;
+		if (t1 > t2)
+			return 1;
+	}
+
+	return 0;
+}
+
+/* $O of ObjId and "$R" for Reparse */
+static int cmp_uints(const __le32 *k1, size_t l1, const __le32 *k2, size_t l2,
+		     const void *p)
+{
+	size_t count;
+
+	if ((size_t)p == 1) {
+		/*
+		 * ni_delete_all -> ntfs_remove_reparse -> delete all with this reference
+		 * k1, k2 - pointers to REPARSE_KEY
+		 */
+
+		k1 += 1; // skip REPARSE_KEY.ReparseTag
+		k2 += 1; // skip REPARSE_KEY.ReparseTag
+		if (l2 <= sizeof(int))
+			return -1;
+		l2 -= sizeof(int);
+		if (l1 <= sizeof(int))
+			return 1;
+		l1 -= sizeof(int);
+	}
+
+	if (l2 < sizeof(int))
+		return -1;
+
+	for (count = min(l1, l2) >> 2; count > 0; --count, ++k1, ++k2) {
+		u32 t1 = le32_to_cpu(*k1);
+		u32 t2 = le32_to_cpu(*k2);
+
+		if (t1 > t2)
+			return 1;
+		if (t1 < t2)
+			return -1;
+	}
+
+	if (l1 > l2)
+		return 1;
+	if (l1 < l2)
+		return -1;
+
+	return 0;
+}
+
+static inline NTFS_CMP_FUNC get_cmp_func(const struct INDEX_ROOT *root)
+{
+	switch (root->type) {
+	case ATTR_NAME:
+		if (root->rule == NTFS_COLLATION_TYPE_FILENAME)
+			return (NTFS_CMP_FUNC)&cmp_fnames;
+		break;
+	case ATTR_ZERO:
+		switch (root->rule) {
+		case NTFS_COLLATION_TYPE_UINT:
+			return (NTFS_CMP_FUNC)&cmp_uint;
+		case NTFS_COLLATION_TYPE_SECURITY_HASH:
+			return (NTFS_CMP_FUNC)&cmp_sdh;
+		case NTFS_COLLATION_TYPE_UINTS:
+			return (NTFS_CMP_FUNC)&cmp_uints;
+		default:
+			break;
+		}
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
+struct bmp_buf {
+	struct ATTRIB *b;
+	struct mft_inode *mi;
+	struct buffer_head *bh;
+	ulong *buf;
+	size_t bit;
+	u32 nbits;
+	u64 new_valid;
+};
+
+static int bmp_buf_get(struct ntfs_index *indx, struct ntfs_inode *ni,
+		       size_t bit, struct bmp_buf *bbuf)
+{
+	struct ATTRIB *b;
+	size_t data_size, valid_size, vbo, off = bit >> 3;
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	CLST vcn = off >> sbi->cluster_bits;
+	struct ATTR_LIST_ENTRY *le = NULL;
+	struct buffer_head *bh;
+	struct super_block *sb;
+	u32 blocksize;
+	const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+	bbuf->bh = NULL;
+
+	b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+			 &vcn, &bbuf->mi);
+	bbuf->b = b;
+	if (!b)
+		return -EINVAL;
+
+	if (!b->non_res) {
+		data_size = le32_to_cpu(b->res.data_size);
+
+		if (off >= data_size)
+			return -EINVAL;
+
+		bbuf->buf = (ulong *)resident_data(b);
+		bbuf->bit = 0;
+		bbuf->nbits = data_size * 8;
+
+		return 0;
+	}
+
+	data_size = le64_to_cpu(b->nres.data_size);
+	if (WARN_ON(off >= data_size)) {
+		/* looks like filesystem error */
+		return -EINVAL;
+	}
+
+	valid_size = le64_to_cpu(b->nres.valid_size);
+
+	bh = ntfs_bread_run(sbi, &indx->bitmap_run, off);
+	if (!bh)
+		return -EIO;
+
+	if (IS_ERR(bh))
+		return PTR_ERR(bh);
+
+	bbuf->bh = bh;
+
+	if (buffer_locked(bh))
+		__wait_on_buffer(bh);
+
+	lock_buffer(bh);
+
+	sb = sbi->sb;
+	blocksize = sb->s_blocksize;
+
+	vbo = off & ~(size_t)sbi->block_mask;
+
+	bbuf->new_valid = vbo + blocksize;
+	if (bbuf->new_valid <= valid_size)
+		bbuf->new_valid = 0;
+	else if (bbuf->new_valid > data_size)
+		bbuf->new_valid = data_size;
+
+	if (vbo >= valid_size) {
+		memset(bh->b_data, 0, blocksize);
+	} else if (vbo + blocksize > valid_size) {
+		u32 voff = valid_size & sbi->block_mask;
+
+		memset(bh->b_data + voff, 0, blocksize - voff);
+	}
+
+	bbuf->buf = (ulong *)bh->b_data;
+	bbuf->bit = 8 * (off & ~(size_t)sbi->block_mask);
+	bbuf->nbits = 8 * blocksize;
+
+	return 0;
+}
+
+static void bmp_buf_put(struct bmp_buf *bbuf, bool dirty)
+{
+	struct buffer_head *bh = bbuf->bh;
+	struct ATTRIB *b = bbuf->b;
+
+	if (!bh) {
+		if (b && !b->non_res && dirty)
+			bbuf->mi->dirty = true;
+		return;
+	}
+
+	if (!dirty)
+		goto out;
+
+	if (bbuf->new_valid) {
+		b->nres.valid_size = cpu_to_le64(bbuf->new_valid);
+		bbuf->mi->dirty = true;
+	}
+
+	set_buffer_uptodate(bh);
+	mark_buffer_dirty(bh);
+
+out:
+	unlock_buffer(bh);
+	put_bh(bh);
+}
+
+/*
+ * indx_mark_used
+ *
+ * marks the bit 'bit' as used
+ */
+static int indx_mark_used(struct ntfs_index *indx, struct ntfs_inode *ni,
+			  size_t bit)
+{
+	int err;
+	struct bmp_buf bbuf;
+
+	err = bmp_buf_get(indx, ni, bit, &bbuf);
+	if (err)
+		return err;
+
+	__set_bit(bit - bbuf.bit, bbuf.buf);
+
+	bmp_buf_put(&bbuf, true);
+
+	return 0;
+}
+
+/*
+ * indx_mark_free
+ *
+ * the bit 'bit' as free
+ */
+static int indx_mark_free(struct ntfs_index *indx, struct ntfs_inode *ni,
+			  size_t bit)
+{
+	int err;
+	struct bmp_buf bbuf;
+
+	err = bmp_buf_get(indx, ni, bit, &bbuf);
+	if (err)
+		return err;
+
+	__clear_bit(bit - bbuf.bit, bbuf.buf);
+
+	bmp_buf_put(&bbuf, true);
+
+	return 0;
+}
+
+/*
+ * if ntfs_readdir calls this function (indx_used_bit -> scan_nres_bitmap),
+ * inode is shared locked and no ni_lock
+ * use rw_semaphore for read/write access to bitmap_run
+ */
+static int scan_nres_bitmap(struct ntfs_inode *ni, struct ATTRIB *bitmap,
+			    struct ntfs_index *indx, size_t from,
+			    bool (*fn)(const ulong *buf, u32 bit, u32 bits,
+				       size_t *ret),
+			    size_t *ret)
+{
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	struct super_block *sb = sbi->sb;
+	struct runs_tree *run = &indx->bitmap_run;
+	struct rw_semaphore *lock = &indx->run_lock;
+	u32 nbits = sb->s_blocksize * 8;
+	u32 blocksize = sb->s_blocksize;
+	u64 valid_size = le64_to_cpu(bitmap->nres.valid_size);
+	u64 data_size = le64_to_cpu(bitmap->nres.data_size);
+	sector_t eblock = bytes_to_block(sb, data_size);
+	size_t vbo = from >> 3;
+	sector_t blk = (vbo & sbi->cluster_mask) >> sb->s_blocksize_bits;
+	sector_t vblock = vbo >> sb->s_blocksize_bits;
+	sector_t blen, block;
+	CLST lcn, clen, vcn, vcn_next;
+	size_t idx;
+	struct buffer_head *bh;
+	bool ok;
+
+	*ret = MINUS_ONE_T;
+
+	if (vblock >= eblock)
+		return 0;
+
+	from &= nbits - 1;
+	vcn = vbo >> sbi->cluster_bits;
+
+	down_read(lock);
+	ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx);
+	up_read(lock);
+
+next_run:
+	if (!ok) {
+		int err;
+		const struct INDEX_NAMES *name = &s_index_names[indx->type];
+
+		down_write(lock);
+		err = attr_load_runs_vcn(ni, ATTR_BITMAP, name->name,
+					 name->name_len, run, vcn);
+		up_write(lock);
+		if (err)
+			return err;
+		down_read(lock);
+		ok = run_lookup_entry(run, vcn, &lcn, &clen, &idx);
+		up_read(lock);
+		if (!ok)
+			return -EINVAL;
+	}
+
+	blen = (sector_t)clen * sbi->blocks_per_cluster;
+	block = (sector_t)lcn * sbi->blocks_per_cluster;
+
+	for (; blk < blen; blk++, from = 0) {
+		bh = ntfs_bread(sb, block + blk);
+		if (!bh)
+			return -EIO;
+
+		vbo = (u64)vblock << sb->s_blocksize_bits;
+		if (vbo >= valid_size) {
+			memset(bh->b_data, 0, blocksize);
+		} else if (vbo + blocksize > valid_size) {
+			u32 voff = valid_size & sbi->block_mask;
+
+			memset(bh->b_data + voff, 0, blocksize - voff);
+		}
+
+		if (vbo + blocksize > data_size)
+			nbits = 8 * (data_size - vbo);
+
+		ok = nbits > from ?
+			     (*fn)((ulong *)bh->b_data, from, nbits, ret) :
+			     false;
+		put_bh(bh);
+
+		if (ok) {
+			*ret += 8 * vbo;
+			return 0;
+		}
+
+		if (++vblock >= eblock) {
+			*ret = MINUS_ONE_T;
+			return 0;
+		}
+	}
+	blk = 0;
+	vcn_next = vcn + clen;
+	down_read(lock);
+	ok = run_get_entry(run, ++idx, &vcn, &lcn, &clen) && vcn == vcn_next;
+	if (!ok)
+		vcn = vcn_next;
+	up_read(lock);
+	goto next_run;
+}
+
+static bool scan_for_free(const ulong *buf, u32 bit, u32 bits, size_t *ret)
+{
+	size_t pos = find_next_zero_bit(buf, bits, bit);
+
+	if (pos >= bits)
+		return false;
+	*ret = pos;
+	return true;
+}
+
+/*
+ * indx_find_free
+ *
+ * looks for free bit
+ * returns -1 if no free bits
+ */
+static int indx_find_free(struct ntfs_index *indx, struct ntfs_inode *ni,
+			  size_t *bit, struct ATTRIB **bitmap)
+{
+	struct ATTRIB *b;
+	struct ATTR_LIST_ENTRY *le = NULL;
+	const struct INDEX_NAMES *in = &s_index_names[indx->type];
+	int err;
+
+	b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+			 NULL, NULL);
+
+	if (!b)
+		return -ENOENT;
+
+	*bitmap = b;
+	*bit = MINUS_ONE_T;
+
+	if (!b->non_res) {
+		u32 nbits = 8 * le32_to_cpu(b->res.data_size);
+		size_t pos = find_next_zero_bit(resident_data(b), nbits, 0);
+
+		if (pos < nbits)
+			*bit = pos;
+	} else {
+		err = scan_nres_bitmap(ni, b, indx, 0, &scan_for_free, bit);
+
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static bool scan_for_used(const ulong *buf, u32 bit, u32 bits, size_t *ret)
+{
+	size_t pos = find_next_bit(buf, bits, bit);
+
+	if (pos >= bits)
+		return false;
+	*ret = pos;
+	return true;
+}
+
+/*
+ * indx_used_bit
+ *
+ * looks for used bit
+ * returns MINUS_ONE_T if no used bits
+ */
+int indx_used_bit(struct ntfs_index *indx, struct ntfs_inode *ni, size_t *bit)
+{
+	struct ATTRIB *b;
+	struct ATTR_LIST_ENTRY *le = NULL;
+	size_t from = *bit;
+	const struct INDEX_NAMES *in = &s_index_names[indx->type];
+	int err;
+
+	b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+			 NULL, NULL);
+
+	if (!b)
+		return -ENOENT;
+
+	*bit = MINUS_ONE_T;
+
+	if (!b->non_res) {
+		u32 nbits = le32_to_cpu(b->res.data_size) * 8;
+		size_t pos = find_next_bit(resident_data(b), nbits, from);
+
+		if (pos < nbits)
+			*bit = pos;
+	} else {
+		err = scan_nres_bitmap(ni, b, indx, from, &scan_for_used, bit);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+/*
+ * hdr_find_split
+ *
+ * finds a point at which the index allocation buffer would like to
+ * be split.
+ * NOTE: This function should never return 'END' entry NULL returns on error
+ */
+static const inline struct NTFS_DE *hdr_find_split(const struct INDEX_HDR *hdr)
+{
+	size_t o;
+	const struct NTFS_DE *e = hdr_first_de(hdr);
+	u32 used_2 = le32_to_cpu(hdr->used) >> 1;
+	u16 esize = le16_to_cpu(e->size);
+
+	if (!e || de_is_last(e))
+		return NULL;
+
+	for (o = le32_to_cpu(hdr->de_off) + esize; o < used_2; o += esize) {
+		const struct NTFS_DE *p = e;
+
+		e = Add2Ptr(hdr, o);
+
+		/* We must not return END entry */
+		if (de_is_last(e))
+			return p;
+
+		esize = le16_to_cpu(e->size);
+	}
+
+	return e;
+}
+
+/*
+ * hdr_insert_head
+ *
+ * inserts some entries at the beginning of the buffer.
+ * It is used to insert entries into a newly-created buffer.
+ */
+static const inline struct NTFS_DE *
+hdr_insert_head(struct INDEX_HDR *hdr, const void *ins, u32 ins_bytes)
+{
+	u32 to_move;
+	struct NTFS_DE *e = hdr_first_de(hdr);
+	u32 used = le32_to_cpu(hdr->used);
+
+	if (!e)
+		return NULL;
+
+	/* Now we just make room for the inserted entries and jam it in. */
+	to_move = used - le32_to_cpu(hdr->de_off);
+	memmove(Add2Ptr(e, ins_bytes), e, to_move);
+	memcpy(e, ins, ins_bytes);
+	hdr->used = cpu_to_le32(used + ins_bytes);
+
+	return e;
+}
+
+void fnd_clear(struct ntfs_fnd *fnd)
+{
+	int i;
+
+	for (i = 0; i < fnd->level; i++) {
+		struct indx_node *n = fnd->nodes[i];
+
+		if (!n)
+			continue;
+
+		put_indx_node(n);
+		fnd->nodes[i] = NULL;
+	}
+	fnd->level = 0;
+	fnd->root_de = NULL;
+}
+
+static int fnd_push(struct ntfs_fnd *fnd, struct indx_node *n,
+		    struct NTFS_DE *e)
+{
+	int i;
+
+	i = fnd->level;
+	if (i < 0 || i >= ARRAY_SIZE(fnd->nodes))
+		return -EINVAL;
+	fnd->nodes[i] = n;
+	fnd->de[i] = e;
+	fnd->level += 1;
+	return 0;
+}
+
+static struct indx_node *fnd_pop(struct ntfs_fnd *fnd)
+{
+	struct indx_node *n;
+	int i = fnd->level;
+
+	i -= 1;
+	n = fnd->nodes[i];
+	fnd->nodes[i] = NULL;
+	fnd->level = i;
+
+	return n;
+}
+
+static bool fnd_is_empty(struct ntfs_fnd *fnd)
+{
+	if (!fnd->level)
+		return !fnd->root_de;
+
+	return !fnd->de[fnd->level - 1];
+}
+
+/*
+ * hdr_find_e
+ *
+ * locates an entry the index buffer.
+ * If no matching entry is found, it returns the first entry which is greater
+ * than the desired entry If the search key is greater than all the entries the
+ * buffer, it returns the 'end' entry. This function does a binary search of the
+ * current index buffer, for the first entry that is <= to the search value
+ * Returns NULL if error
+ */
+static struct NTFS_DE *hdr_find_e(const struct ntfs_index *indx,
+				  const struct INDEX_HDR *hdr, const void *key,
+				  size_t key_len, const void *ctx, int *diff)
+{
+	struct NTFS_DE *e;
+	NTFS_CMP_FUNC cmp = indx->cmp;
+	u32 e_size, e_key_len;
+	u32 end = le32_to_cpu(hdr->used);
+	u32 off = le32_to_cpu(hdr->de_off);
+
+#ifdef NTFS3_INDEX_BINARY_SEARCH
+	int max_idx = 0, fnd, min_idx;
+	int nslots = 64;
+	u16 *offs;
+
+	if (end > 0x10000)
+		goto next;
+
+	offs = ntfs_malloc(sizeof(u16) * nslots);
+	if (!offs)
+		goto next;
+
+	/* use binary search algorithm */
+next1:
+	if (off + sizeof(struct NTFS_DE) > end) {
+		e = NULL;
+		goto out1;
+	}
+	e = Add2Ptr(hdr, off);
+	e_size = le16_to_cpu(e->size);
+
+	if (e_size < sizeof(struct NTFS_DE) || off + e_size > end) {
+		e = NULL;
+		goto out1;
+	}
+
+	if (max_idx >= nslots) {
+		u16 *ptr;
+		int new_slots = QuadAlign(2 * nslots);
+
+		ptr = ntfs_malloc(sizeof(u16) * new_slots);
+		if (ptr)
+			memcpy(ptr, offs, sizeof(u16) * max_idx);
+		ntfs_free(offs);
+		offs = ptr;
+		nslots = new_slots;
+		if (!ptr)
+			goto next;
+	}
+
+	/* Store entry table */
+	offs[max_idx] = off;
+
+	if (!de_is_last(e)) {
+		off += e_size;
+		max_idx += 1;
+		goto next1;
+	}
+
+	/*
+	 * Table of pointers is created
+	 * Use binary search to find entry that is <= to the search value
+	 */
+	fnd = -1;
+	min_idx = 0;
+
+	while (min_idx <= max_idx) {
+		int mid_idx = min_idx + ((max_idx - min_idx) >> 1);
+		int diff2;
+
+		e = Add2Ptr(hdr, offs[mid_idx]);
+
+		e_key_len = le16_to_cpu(e->key_size);
+
+		diff2 = (*cmp)(key, key_len, e + 1, e_key_len, ctx);
+
+		if (!diff2) {
+			*diff = 0;
+			goto out1;
+		}
+
+		if (diff2 < 0) {
+			max_idx = mid_idx - 1;
+			fnd = mid_idx;
+			if (!fnd)
+				break;
+		} else {
+			min_idx = mid_idx + 1;
+		}
+	}
+
+	if (fnd == -1) {
+		e = NULL;
+		goto out1;
+	}
+
+	*diff = -1;
+	e = Add2Ptr(hdr, offs[fnd]);
+
+out1:
+	ntfs_free(offs);
+
+	return e;
+#endif
+
+next:
+	/*
+	 * Entries index are sorted
+	 * Enumerate all entries until we find entry that is <= to the search value
+	 */
+	if (off + sizeof(struct NTFS_DE) > end)
+		return NULL;
+
+	e = Add2Ptr(hdr, off);
+	e_size = le16_to_cpu(e->size);
+
+	if (e_size < sizeof(struct NTFS_DE) || off + e_size > end)
+		return NULL;
+
+	off += e_size;
+
+	e_key_len = le16_to_cpu(e->key_size);
+
+	*diff = (*cmp)(key, key_len, e + 1, e_key_len, ctx);
+	if (!*diff)
+		return e;
+
+	if (*diff <= 0)
+		return e;
+
+	if (de_is_last(e)) {
+		*diff = 1;
+		return e;
+	}
+	goto next;
+}
+
+/*
+ * hdr_insert_de
+ *
+ * inserts an index entry into the buffer.
+ * 'before' should be a pointer previously returned from hdr_find_e
+ */
+static struct NTFS_DE *hdr_insert_de(const struct ntfs_index *indx,
+				     struct INDEX_HDR *hdr,
+				     const struct NTFS_DE *de,
+				     struct NTFS_DE *before, const void *ctx)
+{
+	int diff;
+	size_t off = PtrOffset(hdr, before);
+	u32 used = le32_to_cpu(hdr->used);
+	u32 total = le32_to_cpu(hdr->total);
+	u16 de_size = le16_to_cpu(de->size);
+
+	/* First, check to see if there's enough room */
+	if (used + de_size > total)
+		return NULL;
+
+	/* We know there's enough space, so we know we'll succeed. */
+	if (before) {
+		/* Check that before is inside Index */
+		if (off >= used || off < le32_to_cpu(hdr->de_off) ||
+		    off + le16_to_cpu(before->size) > total) {
+			return NULL;
+		}
+		goto ok;
+	}
+	/* No insert point is applied. Get it manually */
+	before = hdr_find_e(indx, hdr, de + 1, le16_to_cpu(de->key_size), ctx,
+			    &diff);
+	if (!before)
+		return NULL;
+	off = PtrOffset(hdr, before);
+
+ok:
+	/* Now we just make room for the entry and jam it in. */
+	memmove(Add2Ptr(before, de_size), before, used - off);
+
+	hdr->used = cpu_to_le32(used + de_size);
+	memcpy(before, de, de_size);
+
+	return before;
+}
+
+/*
+ * hdr_delete_de
+ *
+ * removes an entry from the index buffer
+ */
+static inline struct NTFS_DE *hdr_delete_de(struct INDEX_HDR *hdr,
+					    struct NTFS_DE *re)
+{
+	u32 used = le32_to_cpu(hdr->used);
+	u16 esize = le16_to_cpu(re->size);
+	u32 off = PtrOffset(hdr, re);
+	int bytes = used - (off + esize);
+
+	if (off >= used || esize < sizeof(struct NTFS_DE) ||
+	    bytes < sizeof(struct NTFS_DE))
+		return NULL;
+
+	hdr->used = cpu_to_le32(used - esize);
+	memmove(re, Add2Ptr(re, esize), bytes);
+
+	return re;
+}
+
+void indx_clear(struct ntfs_index *indx)
+{
+	run_close(&indx->alloc_run);
+	run_close(&indx->bitmap_run);
+}
+
+int indx_init(struct ntfs_index *indx, struct ntfs_sb_info *sbi,
+	      const struct ATTRIB *attr, enum index_mutex_classed type)
+{
+	u32 t32;
+	const struct INDEX_ROOT *root = resident_data(attr);
+
+	/* Check root fields */
+	if (!root->index_block_clst)
+		return -EINVAL;
+
+	indx->type = type;
+	indx->idx2vbn_bits = __ffs(root->index_block_clst);
+
+	t32 = le32_to_cpu(root->index_block_size);
+	indx->index_bits = blksize_bits(t32);
+
+	/* Check index record size */
+	if (t32 < sbi->cluster_size) {
+		/* index record is smaller than a cluster, use 512 blocks */
+		if (t32 != root->index_block_clst * SECTOR_SIZE)
+			return -EINVAL;
+
+		/* Check alignment to a cluster */
+		if ((sbi->cluster_size >> SECTOR_SHIFT) &
+		    (root->index_block_clst - 1)) {
+			return -EINVAL;
+		}
+
+		indx->vbn2vbo_bits = SECTOR_SHIFT;
+	} else {
+		/* index record must be a multiple of cluster size */
+		if (t32 != root->index_block_clst << sbi->cluster_bits)
+			return -EINVAL;
+
+		indx->vbn2vbo_bits = sbi->cluster_bits;
+	}
+
+	init_rwsem(&indx->run_lock);
+
+	indx->cmp = get_cmp_func(root);
+	return indx->cmp ? 0 : -EINVAL;
+}
+
+static struct indx_node *indx_new(struct ntfs_index *indx,
+				  struct ntfs_inode *ni, CLST vbn,
+				  const __le64 *sub_vbn)
+{
+	int err;
+	struct NTFS_DE *e;
+	struct indx_node *r;
+	struct INDEX_HDR *hdr;
+	struct INDEX_BUFFER *index;
+	u64 vbo = (u64)vbn << indx->vbn2vbo_bits;
+	u32 bytes = 1u << indx->index_bits;
+	u16 fn;
+	u32 eo;
+
+	r = ntfs_zalloc(sizeof(struct indx_node));
+	if (!r)
+		return ERR_PTR(-ENOMEM);
+
+	index = ntfs_zalloc(bytes);
+	if (!index) {
+		ntfs_free(r);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	err = ntfs_get_bh(ni->mi.sbi, &indx->alloc_run, vbo, bytes, &r->nb);
+
+	if (err) {
+		ntfs_free(index);
+		ntfs_free(r);
+		return ERR_PTR(err);
+	}
+
+	/* Create header */
+	index->rhdr.sign = NTFS_INDX_SIGNATURE;
+	index->rhdr.fix_off = cpu_to_le16(sizeof(struct INDEX_BUFFER)); // 0x28
+	fn = (bytes >> SECTOR_SHIFT) + 1; // 9
+	index->rhdr.fix_num = cpu_to_le16(fn);
+	index->vbn = cpu_to_le64(vbn);
+	hdr = &index->ihdr;
+	eo = QuadAlign(sizeof(struct INDEX_BUFFER) + fn * sizeof(short));
+	hdr->de_off = cpu_to_le32(eo);
+
+	e = Add2Ptr(hdr, eo);
+
+	if (sub_vbn) {
+		e->flags = NTFS_IE_LAST | NTFS_IE_HAS_SUBNODES;
+		e->size = cpu_to_le16(sizeof(struct NTFS_DE) + sizeof(u64));
+		hdr->used =
+			cpu_to_le32(eo + sizeof(struct NTFS_DE) + sizeof(u64));
+		de_set_vbn_le(e, *sub_vbn);
+		hdr->flags = 1;
+	} else {
+		e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+		hdr->used = cpu_to_le32(eo + sizeof(struct NTFS_DE));
+		e->flags = NTFS_IE_LAST;
+	}
+
+	hdr->total = cpu_to_le32(bytes - offsetof(struct INDEX_BUFFER, ihdr));
+
+	r->index = index;
+	return r;
+}
+
+struct INDEX_ROOT *indx_get_root(struct ntfs_index *indx, struct ntfs_inode *ni,
+				 struct ATTRIB **attr, struct mft_inode **mi)
+{
+	struct ATTR_LIST_ENTRY *le = NULL;
+	struct ATTRIB *a;
+	const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+	a = ni_find_attr(ni, NULL, &le, ATTR_ROOT, in->name, in->name_len, NULL,
+			 mi);
+	if (!a)
+		return NULL;
+
+	if (attr)
+		*attr = a;
+
+	return resident_data_ex(a, sizeof(struct INDEX_ROOT));
+}
+
+static int indx_write(struct ntfs_index *indx, struct ntfs_inode *ni,
+		      struct indx_node *node, int sync)
+{
+	struct INDEX_BUFFER *ib = node->index;
+
+	return ntfs_write_bh(ni->mi.sbi, &ib->rhdr, &node->nb, sync);
+}
+
+/*
+ * if ntfs_readdir calls this function
+ * inode is shared locked and no ni_lock
+ * use rw_semaphore for read/write access to alloc_run
+ */
+int indx_read(struct ntfs_index *indx, struct ntfs_inode *ni, CLST vbn,
+	      struct indx_node **node)
+{
+	int err;
+	struct INDEX_BUFFER *ib;
+	struct runs_tree *run = &indx->alloc_run;
+	struct rw_semaphore *lock = &indx->run_lock;
+	u64 vbo = (u64)vbn << indx->vbn2vbo_bits;
+	u32 bytes = 1u << indx->index_bits;
+	struct indx_node *in = *node;
+	const struct INDEX_NAMES *name;
+
+	if (!in) {
+		in = ntfs_zalloc(sizeof(struct indx_node));
+		if (!in)
+			return -ENOMEM;
+	} else {
+		nb_put(&in->nb);
+	}
+
+	ib = in->index;
+	if (!ib) {
+		ib = ntfs_malloc(bytes);
+		if (!ib) {
+			err = -ENOMEM;
+			goto out;
+		}
+	}
+
+	down_read(lock);
+	err = ntfs_read_bh(ni->mi.sbi, run, vbo, &ib->rhdr, bytes, &in->nb);
+	up_read(lock);
+	if (!err)
+		goto ok;
+
+	if (err == -E_NTFS_FIXUP)
+		goto ok;
+
+	if (err != -ENOENT)
+		goto out;
+
+	name = &s_index_names[indx->type];
+	down_write(lock);
+	err = attr_load_runs_range(ni, ATTR_ALLOC, name->name, name->name_len,
+				   run, vbo, vbo + bytes);
+	up_write(lock);
+	if (err)
+		goto out;
+
+	down_read(lock);
+	err = ntfs_read_bh(ni->mi.sbi, run, vbo, &ib->rhdr, bytes, &in->nb);
+	up_read(lock);
+	if (err == -E_NTFS_FIXUP)
+		goto ok;
+
+	if (err)
+		goto out;
+
+ok:
+	if (err == -E_NTFS_FIXUP) {
+		ntfs_write_bh(ni->mi.sbi, &ib->rhdr, &in->nb, 0);
+		err = 0;
+	}
+
+	in->index = ib;
+	*node = in;
+
+out:
+	if (ib != in->index)
+		ntfs_free(ib);
+
+	if (*node != in) {
+		nb_put(&in->nb);
+		ntfs_free(in);
+	}
+
+	return err;
+}
+
+/*
+ * indx_find
+ *
+ * scans NTFS directory for given entry
+ */
+int indx_find(struct ntfs_index *indx, struct ntfs_inode *ni,
+	      const struct INDEX_ROOT *root, const void *key, size_t key_len,
+	      const void *ctx, int *diff, struct NTFS_DE **entry,
+	      struct ntfs_fnd *fnd)
+{
+	int err;
+	struct NTFS_DE *e;
+	const struct INDEX_HDR *hdr;
+	struct indx_node *node;
+
+	if (!root)
+		root = indx_get_root(&ni->dir, ni, NULL, NULL);
+
+	if (!root) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	hdr = &root->ihdr;
+
+	/* Check cache */
+	e = fnd->level ? fnd->de[fnd->level - 1] : fnd->root_de;
+	if (e && !de_is_last(e) &&
+	    !(*indx->cmp)(key, key_len, e + 1, le16_to_cpu(e->key_size), ctx)) {
+		*entry = e;
+		*diff = 0;
+		return 0;
+	}
+
+	/* Soft finder reset */
+	fnd_clear(fnd);
+
+	/* Lookup entry that is <= to the search value */
+	e = hdr_find_e(indx, hdr, key, key_len, ctx, diff);
+	if (!e)
+		return -EINVAL;
+
+	if (fnd)
+		fnd->root_de = e;
+
+	err = 0;
+
+	for (;;) {
+		node = NULL;
+		if (*diff >= 0 || !de_has_vcn_ex(e)) {
+			*entry = e;
+			goto out;
+		}
+
+		/* Read next level. */
+		err = indx_read(indx, ni, de_get_vbn(e), &node);
+		if (err)
+			goto out;
+
+		/* Lookup entry that is <= to the search value */
+		e = hdr_find_e(indx, &node->index->ihdr, key, key_len, ctx,
+			       diff);
+		if (!e) {
+			err = -EINVAL;
+			put_indx_node(node);
+			goto out;
+		}
+
+		fnd_push(fnd, node, e);
+	}
+
+out:
+	return err;
+}
+
+int indx_find_sort(struct ntfs_index *indx, struct ntfs_inode *ni,
+		   const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+		   struct ntfs_fnd *fnd)
+{
+	int err;
+	struct indx_node *n = NULL;
+	struct NTFS_DE *e;
+	size_t iter = 0;
+	int level = fnd->level;
+
+	if (!*entry) {
+		/* Start find */
+		e = hdr_first_de(&root->ihdr);
+		if (!e)
+			return 0;
+		fnd_clear(fnd);
+		fnd->root_de = e;
+	} else if (!level) {
+		if (de_is_last(fnd->root_de)) {
+			*entry = NULL;
+			return 0;
+		}
+
+		e = hdr_next_de(&root->ihdr, fnd->root_de);
+		if (!e)
+			return -EINVAL;
+		fnd->root_de = e;
+	} else {
+		n = fnd->nodes[level - 1];
+		e = fnd->de[level - 1];
+
+		if (de_is_last(e))
+			goto pop_level;
+
+		e = hdr_next_de(&n->index->ihdr, e);
+		if (!e)
+			return -EINVAL;
+
+		fnd->de[level - 1] = e;
+	}
+
+	/* Just to avoid tree cycle */
+next_iter:
+	if (iter++ >= 1000)
+		return -EINVAL;
+
+	while (de_has_vcn_ex(e)) {
+		if (le16_to_cpu(e->size) <
+		    sizeof(struct NTFS_DE) + sizeof(u64)) {
+			if (n) {
+				fnd_pop(fnd);
+				ntfs_free(n);
+			}
+			return -EINVAL;
+		}
+
+		/* Read next level */
+		err = indx_read(indx, ni, de_get_vbn(e), &n);
+		if (err)
+			return err;
+
+		/* Try next level */
+		e = hdr_first_de(&n->index->ihdr);
+		if (!e) {
+			ntfs_free(n);
+			return -EINVAL;
+		}
+
+		fnd_push(fnd, n, e);
+	}
+
+	if (le16_to_cpu(e->size) > sizeof(struct NTFS_DE)) {
+		*entry = e;
+		return 0;
+	}
+
+pop_level:
+	for (;;) {
+		if (!de_is_last(e))
+			goto next_iter;
+
+		/* Pop one level */
+		if (n) {
+			fnd_pop(fnd);
+			ntfs_free(n);
+		}
+
+		level = fnd->level;
+
+		if (level) {
+			n = fnd->nodes[level - 1];
+			e = fnd->de[level - 1];
+		} else if (fnd->root_de) {
+			n = NULL;
+			e = fnd->root_de;
+			fnd->root_de = NULL;
+		} else {
+			*entry = NULL;
+			return 0;
+		}
+
+		if (le16_to_cpu(e->size) > sizeof(struct NTFS_DE)) {
+			*entry = e;
+			if (!fnd->root_de)
+				fnd->root_de = e;
+			return 0;
+		}
+	}
+}
+
+int indx_find_raw(struct ntfs_index *indx, struct ntfs_inode *ni,
+		  const struct INDEX_ROOT *root, struct NTFS_DE **entry,
+		  size_t *off, struct ntfs_fnd *fnd)
+{
+	int err;
+	struct indx_node *n = NULL;
+	struct NTFS_DE *e = NULL;
+	struct NTFS_DE *e2;
+	size_t bit;
+	CLST next_used_vbn;
+	CLST next_vbn;
+	u32 record_size = ni->mi.sbi->record_size;
+
+	/* Use non sorted algorithm */
+	if (!*entry) {
+		/* This is the first call */
+		e = hdr_first_de(&root->ihdr);
+		if (!e)
+			return 0;
+		fnd_clear(fnd);
+		fnd->root_de = e;
+
+		/* The first call with setup of initial element */
+		if (*off >= record_size) {
+			next_vbn = (((*off - record_size) >> indx->index_bits))
+				   << indx->idx2vbn_bits;
+			/* jump inside cycle 'for'*/
+			goto next;
+		}
+
+		/* Start enumeration from root */
+		*off = 0;
+	} else if (!fnd->root_de)
+		return -EINVAL;
+
+	for (;;) {
+		/* Check if current entry can be used */
+		if (e && le16_to_cpu(e->size) > sizeof(struct NTFS_DE))
+			goto ok;
+
+		if (!fnd->level) {
+			/* Continue to enumerate root */
+			if (!de_is_last(fnd->root_de)) {
+				e = hdr_next_de(&root->ihdr, fnd->root_de);
+				if (!e)
+					return -EINVAL;
+				fnd->root_de = e;
+				continue;
+			}
+
+			/* Start to enumerate indexes from 0 */
+			next_vbn = 0;
+		} else {
+			/* Continue to enumerate indexes */
+			e2 = fnd->de[fnd->level - 1];
+
+			n = fnd->nodes[fnd->level - 1];
+
+			if (!de_is_last(e2)) {
+				e = hdr_next_de(&n->index->ihdr, e2);
+				if (!e)
+					return -EINVAL;
+				fnd->de[fnd->level - 1] = e;
+				continue;
+			}
+
+			/* Continue with next index */
+			next_vbn = le64_to_cpu(n->index->vbn) +
+				   root->index_block_clst;
+		}
+
+next:
+		/* Release current index */
+		if (n) {
+			fnd_pop(fnd);
+			put_indx_node(n);
+			n = NULL;
+		}
+
+		/* Skip all free indexes */
+		bit = next_vbn >> indx->idx2vbn_bits;
+		err = indx_used_bit(indx, ni, &bit);
+		if (err == -ENOENT || bit == MINUS_ONE_T) {
+			/* No used indexes */
+			*entry = NULL;
+			return 0;
+		}
+
+		next_used_vbn = bit << indx->idx2vbn_bits;
+
+		/* Read buffer into memory */
+		err = indx_read(indx, ni, next_used_vbn, &n);
+		if (err)
+			return err;
+
+		e = hdr_first_de(&n->index->ihdr);
+		fnd_push(fnd, n, e);
+		if (!e)
+			return -EINVAL;
+	}
+
+ok:
+	/* return offset to restore enumerator if necessary */
+	if (!n) {
+		/* 'e' points in root */
+		*off = PtrOffset(&root->ihdr, e);
+	} else {
+		/* 'e' points in index */
+		*off = (le64_to_cpu(n->index->vbn) << indx->vbn2vbo_bits) +
+		       record_size + PtrOffset(&n->index->ihdr, e);
+	}
+
+	*entry = e;
+	return 0;
+}
+
+/*
+ * indx_create_allocate
+ *
+ * create "Allocation + Bitmap" attributes
+ */
+static int indx_create_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
+				CLST *vbn)
+{
+	int err = -ENOMEM;
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	struct ATTRIB *bitmap;
+	struct ATTRIB *alloc;
+	u32 alloc_size = ntfs_up_cluster(sbi, 1u << indx->index_bits);
+	u32 bmp_size = QuadAlign(((alloc_size >> indx->index_bits) + 7) >> 3);
+	CLST len = alloc_size >> sbi->cluster_bits;
+	const struct INDEX_NAMES *in = &s_index_names[indx->type];
+	CLST alen;
+	struct runs_tree run;
+
+	run_init(&run);
+
+	err = attr_allocate_clusters(sbi, &run, 0, 0, len, NULL, 0, &alen, 0,
+				     NULL);
+	if (err)
+		goto out;
+
+	err = ni_insert_nonresident(ni, ATTR_ALLOC, in->name, in->name_len,
+				    &run, 0, len, 0, &alloc, NULL);
+	if (err)
+		goto out1;
+
+	err = ni_insert_resident(ni, bmp_size, ATTR_BITMAP, in->name,
+				 in->name_len, &bitmap, NULL);
+	if (err)
+		goto out2;
+
+	memcpy(&indx->alloc_run, &run, sizeof(run));
+
+	*vbn = 0;
+
+	if (in->name == I30_NAME)
+		ni->vfs_inode.i_size = alloc_size;
+
+	return 0;
+
+out2:
+	mi_remove_attr(&ni->mi, alloc);
+
+out1:
+	run_deallocate(sbi, &run, false);
+
+out:
+	return err;
+}
+
+/*
+ * indx_add_allocate
+ *
+ * add clusters to index
+ */
+static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
+			     CLST *vbn)
+{
+	int err;
+	size_t bit;
+	u64 data_size, alloc_size;
+	u64 bmp_size, bmp_size_v;
+	struct ATTRIB *bmp, *alloc;
+	struct mft_inode *mi;
+	const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+	err = indx_find_free(indx, ni, &bit, &bmp);
+	if (err)
+		goto out1;
+
+	if (bit != MINUS_ONE_T) {
+		bmp = NULL;
+	} else {
+		if (bmp->non_res) {
+			bmp_size = le64_to_cpu(bmp->nres.data_size);
+			bmp_size_v = le64_to_cpu(bmp->nres.valid_size);
+		} else {
+			bmp_size = bmp_size_v = le32_to_cpu(bmp->res.data_size);
+		}
+
+		bit = bmp_size << 3;
+	}
+
+	data_size = (u64)(bit + 1) << indx->index_bits;
+	alloc_size = ntfs_up_cluster(ni->mi.sbi, data_size);
+
+	if (bmp) {
+		u64 bits = ((alloc_size >> indx->index_bits) + 7) >> 3;
+
+		/* Increase bitmap */
+		err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+				    &indx->bitmap_run, QuadAlign(bits), NULL,
+				    true, NULL);
+		if (err)
+			goto out1;
+	}
+
+	alloc = ni_find_attr(ni, NULL, NULL, ATTR_ALLOC, in->name, in->name_len,
+			     NULL, &mi);
+	if (!alloc) {
+		if (bmp)
+			goto out2;
+		goto out1;
+	}
+
+	if (alloc_size > le64_to_cpu(alloc->nres.alloc_size)) {
+		/* Increase allocation */
+		err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+				    &indx->alloc_run, alloc_size, &alloc_size,
+				    true, NULL);
+		if (err) {
+			if (bmp)
+				goto out2;
+			goto out1;
+		}
+
+		if (in->name == I30_NAME)
+			ni->vfs_inode.i_size = alloc_size;
+	} else if (data_size > le64_to_cpu(alloc->nres.data_size)) {
+		alloc->nres.data_size = alloc->nres.valid_size =
+			cpu_to_le64(data_size);
+		mi->dirty = true;
+	}
+
+	*vbn = bit << indx->idx2vbn_bits;
+
+	return 0;
+
+out2:
+	/* Ops (no space?) */
+	attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+		      &indx->bitmap_run, bmp_size, &bmp_size_v, false, NULL);
+
+out1:
+	return err;
+}
+
+/*
+ * indx_insert_into_root
+ *
+ * attempts to insert an entry into the index root
+ * If necessary, it will twiddle the index b-tree.
+ */
+static int indx_insert_into_root(struct ntfs_index *indx, struct ntfs_inode *ni,
+				 const struct NTFS_DE *new_de,
+				 struct NTFS_DE *root_de, const void *ctx,
+				 struct ntfs_fnd *fnd)
+{
+	int err = 0;
+	struct NTFS_DE *e, *e0, *re;
+	struct mft_inode *mi;
+	struct ATTRIB *attr;
+	struct MFT_REC *rec;
+	struct INDEX_HDR *hdr;
+	struct indx_node *n;
+	CLST new_vbn;
+	__le64 *sub_vbn, t_vbn;
+	u16 new_de_size;
+	u32 hdr_used, hdr_total, asize, tail, used, aoff, to_move;
+	u32 root_size, new_root_size;
+	struct ntfs_sb_info *sbi;
+	char *next;
+	int ds_root;
+	struct INDEX_ROOT *root, *a_root = NULL;
+
+	/* Get the record this root placed in */
+	root = indx_get_root(indx, ni, &attr, &mi);
+	if (!root)
+		goto out;
+
+	/*
+	 * Try easy case:
+	 * hdr_insert_de will succeed if there's room the root for the new entry.
+	 */
+	hdr = &root->ihdr;
+	sbi = ni->mi.sbi;
+	rec = mi->mrec;
+	aoff = PtrOffset(rec, attr);
+	used = le32_to_cpu(rec->used);
+	new_de_size = le16_to_cpu(new_de->size);
+	hdr_used = le32_to_cpu(hdr->used);
+	hdr_total = le32_to_cpu(hdr->total);
+	asize = le32_to_cpu(attr->size);
+	next = Add2Ptr(attr, asize);
+	tail = used - aoff - asize;
+	root_size = le32_to_cpu(attr->res.data_size);
+
+	ds_root = new_de_size + hdr_used - hdr_total;
+
+	if (used + ds_root < sbi->max_bytes_per_attr) {
+		/* make a room for new elements */
+		memmove(next + ds_root, next, used - aoff - asize);
+		hdr->total = cpu_to_le32(hdr_total + ds_root);
+		e = hdr_insert_de(indx, hdr, new_de, root_de, ctx);
+		WARN_ON(!e);
+		fnd_clear(fnd);
+		fnd->root_de = e;
+		attr->size = cpu_to_le32(asize + ds_root);
+		attr->res.data_size = cpu_to_le32(root_size + ds_root);
+		rec->used = cpu_to_le32(used + ds_root);
+		mi->dirty = true;
+
+		return 0;
+	}
+
+	/* Make a copy of root attribute to restore if error */
+	a_root = ntfs_memdup(attr, asize);
+	if (!a_root) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	/* copy all the non-end entries from the index root to the new buffer.*/
+	to_move = 0;
+	e0 = hdr_first_de(hdr);
+
+	/* Calculate the size to copy */
+	for (e = e0;; e = hdr_next_de(hdr, e)) {
+		if (!e) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		if (de_is_last(e))
+			break;
+		to_move += le16_to_cpu(e->size);
+	}
+
+	n = NULL;
+	if (!to_move) {
+		re = NULL;
+	} else {
+		re = ntfs_memdup(e0, to_move);
+		if (!re) {
+			err = -ENOMEM;
+			goto out;
+		}
+	}
+
+	sub_vbn = NULL;
+	if (de_has_vcn(e)) {
+		t_vbn = de_get_vbn_le(e);
+		sub_vbn = &t_vbn;
+	}
+
+	new_root_size = sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE) +
+			sizeof(u64);
+	ds_root = new_root_size - root_size;
+
+	if (ds_root > 0 && used + ds_root > sbi->max_bytes_per_attr) {
+		/* make root external */
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
+	if (ds_root) {
+		memmove(next + ds_root, next, tail);
+		used += ds_root;
+		asize += ds_root;
+		rec->used = cpu_to_le32(used);
+		attr->size = cpu_to_le32(asize);
+		attr->res.data_size = cpu_to_le32(new_root_size);
+		mi->dirty = true;
+	}
+
+	/* Fill first entry (vcn will be set later) */
+	e = (struct NTFS_DE *)(root + 1);
+	memset(e, 0, sizeof(struct NTFS_DE));
+	e->size = cpu_to_le16(sizeof(struct NTFS_DE) + sizeof(u64));
+	e->flags = NTFS_IE_HAS_SUBNODES | NTFS_IE_LAST;
+
+	hdr->flags = 1;
+	hdr->used = hdr->total =
+		cpu_to_le32(new_root_size - offsetof(struct INDEX_ROOT, ihdr));
+
+	fnd->root_de = hdr_first_de(hdr);
+	mi->dirty = true;
+
+	/* Create alloc and bitmap attributes (if not) */
+	if (run_is_empty(&indx->alloc_run)) {
+		err = indx_create_allocate(indx, ni, &new_vbn);
+		if (err) {
+			/* restore root after 'indx_create_allocate' */
+			memmove(next - ds_root, next, tail);
+			used -= ds_root;
+			rec->used = cpu_to_le32(used);
+			memcpy(attr, a_root, asize);
+			goto out1;
+		}
+	} else {
+		err = indx_add_allocate(indx, ni, &new_vbn);
+		if (err)
+			goto out1;
+	}
+
+	/* layout of record may be changed, so rescan root */
+	root = indx_get_root(indx, ni, &attr, &mi);
+	if (!root) {
+		err = -EINVAL;
+		goto out1;
+	}
+
+	e = (struct NTFS_DE *)(root + 1);
+	*(__le64 *)(e + 1) = cpu_to_le64(new_vbn);
+	mi->dirty = true;
+
+	/* now we can create/format the new buffer and copy the entries into */
+	n = indx_new(indx, ni, new_vbn, sub_vbn);
+	if (IS_ERR(n)) {
+		err = PTR_ERR(n);
+		goto out1;
+	}
+
+	hdr = &n->index->ihdr;
+	hdr_used = le32_to_cpu(hdr->used);
+	hdr_total = le32_to_cpu(hdr->total);
+
+	/* Copy root entries into new buffer */
+	hdr_insert_head(hdr, re, to_move);
+
+	/* Update bitmap attribute */
+	indx_mark_used(indx, ni, new_vbn >> indx->idx2vbn_bits);
+
+	/* Check if we can insert new entry new index buffer */
+	if (hdr_used + new_de_size > hdr_total) {
+		/*
+		 * This occurs if mft record is the same or bigger than index
+		 * buffer. Move all root new index and have no space to add
+		 * new entry classic case when mft record is 1K and index
+		 * buffer 4K the problem should not occurs
+		 */
+		ntfs_free(re);
+		indx_write(indx, ni, n, 0);
+
+		put_indx_node(n);
+		fnd_clear(fnd);
+		err = indx_insert_entry(indx, ni, new_de, ctx, fnd);
+		goto out;
+	}
+
+	/*
+	 * Now root is a parent for new index buffer
+	 * Insert NewEntry a new buffer
+	 */
+	e = hdr_insert_de(indx, hdr, new_de, NULL, ctx);
+	if (!e) {
+		err = -EINVAL;
+		goto out1;
+	}
+	fnd_push(fnd, n, e);
+
+	/* Just write updates index into disk */
+	indx_write(indx, ni, n, 0);
+
+	n = NULL;
+
+out1:
+	ntfs_free(re);
+	if (n)
+		put_indx_node(n);
+
+out:
+	ntfs_free(a_root);
+	return err;
+}
+
+/*
+ * indx_insert_into_buffer
+ *
+ * attempts to insert an entry into an Index Allocation Buffer.
+ * If necessary, it will split the buffer.
+ */
+static int
+indx_insert_into_buffer(struct ntfs_index *indx, struct ntfs_inode *ni,
+			struct INDEX_ROOT *root, const struct NTFS_DE *new_de,
+			const void *ctx, int level, struct ntfs_fnd *fnd)
+{
+	int err;
+	const struct NTFS_DE *sp;
+	struct NTFS_DE *e, *de_t, *up_e = NULL;
+	struct indx_node *n2 = NULL;
+	struct indx_node *n1 = fnd->nodes[level];
+	struct INDEX_HDR *hdr1 = &n1->index->ihdr;
+	struct INDEX_HDR *hdr2;
+	u32 to_copy, used;
+	CLST new_vbn;
+	__le64 t_vbn, *sub_vbn;
+	u16 sp_size;
+
+	/* Try the most easy case */
+	e = fnd->level - 1 == level ? fnd->de[level] : NULL;
+	e = hdr_insert_de(indx, hdr1, new_de, e, ctx);
+	fnd->de[level] = e;
+	if (e) {
+		/* Just write updated index into disk */
+		indx_write(indx, ni, n1, 0);
+		return 0;
+	}
+
+	/*
+	 * No space to insert into buffer. Split it.
+	 * To split we:
+	 *  - Save split point ('cause index buffers will be changed)
+	 * - Allocate NewBuffer and copy all entries <= sp into new buffer
+	 * - Remove all entries (sp including) from TargetBuffer
+	 * - Insert NewEntry into left or right buffer (depending on sp <=>
+	 *     NewEntry)
+	 * - Insert sp into parent buffer (or root)
+	 * - Make sp a parent for new buffer
+	 */
+	sp = hdr_find_split(hdr1);
+	if (!sp)
+		return -EINVAL;
+
+	sp_size = le16_to_cpu(sp->size);
+	up_e = ntfs_malloc(sp_size + sizeof(u64));
+	if (!up_e)
+		return -ENOMEM;
+	memcpy(up_e, sp, sp_size);
+
+	if (!hdr1->flags) {
+		up_e->flags |= NTFS_IE_HAS_SUBNODES;
+		up_e->size = cpu_to_le16(sp_size + sizeof(u64));
+		sub_vbn = NULL;
+	} else {
+		t_vbn = de_get_vbn_le(up_e);
+		sub_vbn = &t_vbn;
+	}
+
+	/* Allocate on disk a new index allocation buffer. */
+	err = indx_add_allocate(indx, ni, &new_vbn);
+	if (err)
+		goto out;
+
+	/* Allocate and format memory a new index buffer */
+	n2 = indx_new(indx, ni, new_vbn, sub_vbn);
+	if (IS_ERR(n2)) {
+		err = PTR_ERR(n2);
+		goto out;
+	}
+
+	hdr2 = &n2->index->ihdr;
+
+	/* Make sp a parent for new buffer */
+	de_set_vbn(up_e, new_vbn);
+
+	/* copy all the entries <= sp into the new buffer. */
+	de_t = hdr_first_de(hdr1);
+	to_copy = PtrOffset(de_t, sp);
+	hdr_insert_head(hdr2, de_t, to_copy);
+
+	/* remove all entries (sp including) from hdr1 */
+	used = le32_to_cpu(hdr1->used) - to_copy - sp_size;
+	memmove(de_t, Add2Ptr(sp, sp_size), used - le32_to_cpu(hdr1->de_off));
+	hdr1->used = cpu_to_le32(used);
+
+	/* Insert new entry into left or right buffer (depending on sp <=> new_de) */
+	hdr_insert_de(indx,
+		      (*indx->cmp)(new_de + 1, le16_to_cpu(new_de->key_size),
+				   up_e + 1, le16_to_cpu(up_e->key_size),
+				   ctx) < 0 ?
+			      hdr2 :
+			      hdr1,
+		      new_de, NULL, ctx);
+
+	indx_mark_used(indx, ni, new_vbn >> indx->idx2vbn_bits);
+
+	indx_write(indx, ni, n1, 0);
+	indx_write(indx, ni, n2, 0);
+
+	put_indx_node(n2);
+
+	/*
+	 * we've finished splitting everybody, so we are ready to
+	 * insert the promoted entry into the parent.
+	 */
+	if (!level) {
+		/* Insert in root */
+		err = indx_insert_into_root(indx, ni, up_e, NULL, ctx, fnd);
+		if (err)
+			goto out;
+	} else {
+		/*
+		 * The target buffer's parent is another index buffer
+		 * TODO: Remove recursion
+		 */
+		err = indx_insert_into_buffer(indx, ni, root, up_e, ctx,
+					      level - 1, fnd);
+		if (err)
+			goto out;
+	}
+
+out:
+	ntfs_free(up_e);
+
+	return err;
+}
+
+/*
+ * indx_insert_entry
+ *
+ * inserts new entry into index
+ */
+int indx_insert_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+		      const struct NTFS_DE *new_de, const void *ctx,
+		      struct ntfs_fnd *fnd)
+{
+	int err;
+	int diff;
+	struct NTFS_DE *e;
+	struct ntfs_fnd *fnd_a = NULL;
+	struct INDEX_ROOT *root;
+
+	if (!fnd) {
+		fnd_a = fnd_get();
+		if (!fnd_a) {
+			err = -ENOMEM;
+			goto out1;
+		}
+		fnd = fnd_a;
+	}
+
+	root = indx_get_root(indx, ni, NULL, NULL);
+	if (!root) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (fnd_is_empty(fnd)) {
+		/* Find the spot the tree where we want to insert the new entry. */
+		err = indx_find(indx, ni, root, new_de + 1,
+				le16_to_cpu(new_de->key_size), ctx, &diff, &e,
+				fnd);
+		if (err)
+			goto out;
+
+		if (!diff) {
+			err = -EEXIST;
+			goto out;
+		}
+	}
+
+	if (!fnd->level) {
+		/* The root is also a leaf, so we'll insert the new entry into it. */
+		err = indx_insert_into_root(indx, ni, new_de, fnd->root_de, ctx,
+					    fnd);
+		if (err)
+			goto out;
+	} else {
+		/* found a leaf buffer, so we'll insert the new entry into it.*/
+		err = indx_insert_into_buffer(indx, ni, root, new_de, ctx,
+					      fnd->level - 1, fnd);
+		if (err)
+			goto out;
+	}
+
+out:
+	fnd_put(fnd_a);
+out1:
+	return err;
+}
+
+/*
+ * indx_find_buffer
+ *
+ * locates a buffer the tree.
+ */
+static struct indx_node *indx_find_buffer(struct ntfs_index *indx,
+					  struct ntfs_inode *ni,
+					  const struct INDEX_ROOT *root,
+					  __le64 vbn, struct indx_node *n)
+{
+	int err;
+	const struct NTFS_DE *e;
+	struct indx_node *r;
+	const struct INDEX_HDR *hdr = n ? &n->index->ihdr : &root->ihdr;
+
+	/* Step 1: Scan one level */
+	for (e = hdr_first_de(hdr);; e = hdr_next_de(hdr, e)) {
+		if (!e)
+			return ERR_PTR(-EINVAL);
+
+		if (de_has_vcn(e) && vbn == de_get_vbn_le(e))
+			return n;
+
+		if (de_is_last(e))
+			break;
+	}
+
+	/* Step2: Do recursion */
+	e = Add2Ptr(hdr, le32_to_cpu(hdr->de_off));
+	for (;;) {
+		if (de_has_vcn_ex(e)) {
+			err = indx_read(indx, ni, de_get_vbn(e), &n);
+			if (err)
+				return ERR_PTR(err);
+
+			r = indx_find_buffer(indx, ni, root, vbn, n);
+			if (r)
+				return r;
+		}
+
+		if (de_is_last(e))
+			break;
+
+		e = Add2Ptr(e, le16_to_cpu(e->size));
+	}
+
+	return NULL;
+}
+
+/*
+ * indx_shrink
+ *
+ * deallocates unused tail indexes
+ */
+static int indx_shrink(struct ntfs_index *indx, struct ntfs_inode *ni,
+		       size_t bit)
+{
+	int err = 0;
+	u64 bpb, new_alloc;
+	size_t nbits;
+	struct ATTRIB *b;
+	struct ATTR_LIST_ENTRY *le = NULL;
+	const struct INDEX_NAMES *in = &s_index_names[indx->type];
+
+	b = ni_find_attr(ni, NULL, &le, ATTR_BITMAP, in->name, in->name_len,
+			 NULL, NULL);
+
+	if (!b)
+		return -ENOENT;
+
+	if (!b->non_res) {
+		unsigned long pos;
+		const unsigned long *bm = resident_data(b);
+
+		nbits = le32_to_cpu(b->res.data_size) * 8;
+
+		if (bit >= nbits)
+			return 0;
+
+		pos = find_next_bit(bm, nbits, bit);
+		if (pos < nbits)
+			return 0;
+	} else {
+		size_t used = MINUS_ONE_T;
+
+		nbits = le64_to_cpu(b->nres.data_size) * 8;
+
+		if (bit >= nbits)
+			return 0;
+
+		err = scan_nres_bitmap(ni, b, indx, bit, &scan_for_used, &used);
+		if (err)
+			return err;
+
+		if (used != MINUS_ONE_T)
+			return 0;
+	}
+
+	new_alloc = (u64)bit << indx->index_bits;
+
+	err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+			    &indx->alloc_run, new_alloc, &new_alloc, false,
+			    NULL);
+	if (err)
+		return err;
+
+	if (in->name == I30_NAME)
+		ni->vfs_inode.i_size = new_alloc;
+
+	bpb = bitmap_size(bit);
+	if (bpb * 8 == nbits)
+		return 0;
+
+	err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+			    &indx->bitmap_run, bpb, &bpb, false, NULL);
+
+	return err;
+}
+
+static int indx_free_children(struct ntfs_index *indx, struct ntfs_inode *ni,
+			      const struct NTFS_DE *e, bool trim)
+{
+	int err;
+	struct indx_node *n;
+	struct INDEX_HDR *hdr;
+	CLST vbn = de_get_vbn(e);
+	size_t i;
+
+	err = indx_read(indx, ni, vbn, &n);
+	if (err)
+		return err;
+
+	hdr = &n->index->ihdr;
+	/* First, recurse into the children, if any.*/
+	if (hdr_has_subnode(hdr)) {
+		for (e = hdr_first_de(hdr); e; e = hdr_next_de(hdr, e)) {
+			indx_free_children(indx, ni, e, false);
+			if (de_is_last(e))
+				break;
+		}
+	}
+
+	put_indx_node(n);
+
+	i = vbn >> indx->idx2vbn_bits;
+	/* We've gotten rid of the children; add this buffer to the free list. */
+	indx_mark_free(indx, ni, i);
+
+	if (!trim)
+		return 0;
+
+	/*
+	 * If there are no used indexes after current free index
+	 * then we can truncate allocation and bitmap
+	 * Use bitmap to estimate the case
+	 */
+	indx_shrink(indx, ni, i + 1);
+	return 0;
+}
+
+/*
+ * indx_get_entry_to_replace
+ *
+ * finds a replacement entry for a deleted entry
+ * always returns a node entry:
+ * NTFS_IE_HAS_SUBNODES is set the flags and the size includes the sub_vcn
+ */
+static int indx_get_entry_to_replace(struct ntfs_index *indx,
+				     struct ntfs_inode *ni,
+				     const struct NTFS_DE *de_next,
+				     struct NTFS_DE **de_to_replace,
+				     struct ntfs_fnd *fnd)
+{
+	int err;
+	int level = -1;
+	CLST vbn;
+	struct NTFS_DE *e, *te, *re;
+	struct indx_node *n;
+	struct INDEX_BUFFER *ib;
+
+	*de_to_replace = NULL;
+
+	/* Find first leaf entry down from de_next */
+	vbn = de_get_vbn(de_next);
+	for (;;) {
+		n = NULL;
+		err = indx_read(indx, ni, vbn, &n);
+		if (err)
+			goto out;
+
+		e = hdr_first_de(&n->index->ihdr);
+		fnd_push(fnd, n, e);
+
+		if (!de_is_last(e)) {
+			/*
+			 * This buffer is non-empty, so its first entry could be used as the
+			 * replacement entry.
+			 */
+			level = fnd->level - 1;
+		}
+
+		if (!de_has_vcn(e))
+			break;
+
+		/* This buffer is a node. Continue to go down */
+		vbn = de_get_vbn(e);
+	}
+
+	if (level == -1)
+		goto out;
+
+	n = fnd->nodes[level];
+	te = hdr_first_de(&n->index->ihdr);
+	/* Copy the candidate entry into the replacement entry buffer. */
+	re = ntfs_malloc(le16_to_cpu(te->size) + sizeof(u64));
+	if (!re) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	*de_to_replace = re;
+	memcpy(re, te, le16_to_cpu(te->size));
+
+	if (!de_has_vcn(re)) {
+		/*
+		 * The replacement entry we found doesn't have a sub_vcn. increase its size
+		 * to hold one.
+		 */
+		le16_add_cpu(&re->size, sizeof(u64));
+		re->flags |= NTFS_IE_HAS_SUBNODES;
+	} else {
+		/*
+		 * The replacement entry we found was a node entry, which means that all
+		 * its child buffers are empty. Return them to the free pool.
+		 */
+		indx_free_children(indx, ni, te, true);
+	}
+
+	/*
+	 * Expunge the replacement entry from its former location,
+	 * and then write that buffer.
+	 */
+	ib = n->index;
+	e = hdr_delete_de(&ib->ihdr, te);
+
+	fnd->de[level] = e;
+	indx_write(indx, ni, n, 0);
+
+	/* Check to see if this action created an empty leaf. */
+	if (ib_is_leaf(ib) && ib_is_empty(ib))
+		return 0;
+
+out:
+	fnd_clear(fnd);
+	return err;
+}
+
+/*
+ * indx_delete_entry
+ *
+ * deletes an entry from the index.
+ */
+int indx_delete_entry(struct ntfs_index *indx, struct ntfs_inode *ni,
+		      const void *key, u32 key_len, const void *ctx)
+{
+	int err, diff;
+	struct INDEX_ROOT *root;
+	struct INDEX_HDR *hdr;
+	struct ntfs_fnd *fnd, *fnd2;
+	struct INDEX_BUFFER *ib;
+	struct NTFS_DE *e, *re, *next, *prev, *me;
+	struct indx_node *n, *n2d = NULL;
+	__le64 sub_vbn;
+	int level, level2;
+	struct ATTRIB *attr;
+	struct mft_inode *mi;
+	u32 e_size, root_size, new_root_size;
+	size_t trim_bit;
+	const struct INDEX_NAMES *in;
+
+	fnd = fnd_get();
+	if (!fnd) {
+		err = -ENOMEM;
+		goto out2;
+	}
+
+	fnd2 = fnd_get();
+	if (!fnd2) {
+		err = -ENOMEM;
+		goto out1;
+	}
+
+	root = indx_get_root(indx, ni, &attr, &mi);
+	if (!root) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Locate the entry to remove. */
+	err = indx_find(indx, ni, root, key, key_len, ctx, &diff, &e, fnd);
+	if (err)
+		goto out;
+
+	if (!e || diff) {
+		err = -ENOENT;
+		goto out;
+	}
+
+	level = fnd->level;
+
+	if (level) {
+		n = fnd->nodes[level - 1];
+		e = fnd->de[level - 1];
+		ib = n->index;
+		hdr = &ib->ihdr;
+	} else {
+		hdr = &root->ihdr;
+		e = fnd->root_de;
+		n = NULL;
+	}
+
+	e_size = le16_to_cpu(e->size);
+
+	if (!de_has_vcn_ex(e)) {
+		/* The entry to delete is a leaf, so we can just rip it out */
+		hdr_delete_de(hdr, e);
+
+		if (!level) {
+			hdr->total = hdr->used;
+
+			/* Shrink resident root attribute */
+			mi_resize_attr(mi, attr, 0 - e_size);
+			goto out;
+		}
+
+		indx_write(indx, ni, n, 0);
+
+		/*
+		 * Check to see if removing that entry made
+		 * the leaf empty.
+		 */
+		if (ib_is_leaf(ib) && ib_is_empty(ib)) {
+			fnd_pop(fnd);
+			fnd_push(fnd2, n, e);
+		}
+	} else {
+		/*
+		 * The entry we wish to delete is a node buffer, so we
+		 * have to find a replacement for it.
+		 */
+		next = de_get_next(e);
+
+		err = indx_get_entry_to_replace(indx, ni, next, &re, fnd2);
+		if (err)
+			goto out;
+
+		if (re) {
+			de_set_vbn_le(re, de_get_vbn_le(e));
+			hdr_delete_de(hdr, e);
+
+			err = level ? indx_insert_into_buffer(indx, ni, root,
+							      re, ctx,
+							      fnd->level - 1,
+							      fnd) :
+				      indx_insert_into_root(indx, ni, re, e,
+							    ctx, fnd);
+			ntfs_free(re);
+
+			if (err)
+				goto out;
+		} else {
+			/*
+			 * There is no replacement for the current entry.
+			 * This means that the subtree rooted at its node is empty,
+			 * and can be deleted, which turn means that the node can
+			 * just inherit the deleted entry sub_vcn
+			 */
+			indx_free_children(indx, ni, next, true);
+
+			de_set_vbn_le(next, de_get_vbn_le(e));
+			hdr_delete_de(hdr, e);
+			if (level) {
+				indx_write(indx, ni, n, 0);
+			} else {
+				hdr->total = hdr->used;
+
+				/* Shrink resident root attribute */
+				mi_resize_attr(mi, attr, 0 - e_size);
+			}
+		}
+	}
+
+	/* Delete a branch of tree */
+	if (!fnd2 || !fnd2->level)
+		goto out;
+
+	/* Reinit root 'cause it can be changed */
+	root = indx_get_root(indx, ni, &attr, &mi);
+	if (!root) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	n2d = NULL;
+	sub_vbn = fnd2->nodes[0]->index->vbn;
+	level2 = 0;
+	level = fnd->level;
+
+	hdr = level ? &fnd->nodes[level - 1]->index->ihdr : &root->ihdr;
+
+	/* Scan current level */
+	for (e = hdr_first_de(hdr);; e = hdr_next_de(hdr, e)) {
+		if (!e) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		if (de_has_vcn(e) && sub_vbn == de_get_vbn_le(e))
+			break;
+
+		if (de_is_last(e)) {
+			e = NULL;
+			break;
+		}
+	}
+
+	if (!e) {
+		/* Do slow search from root */
+		struct indx_node *in;
+
+		fnd_clear(fnd);
+
+		in = indx_find_buffer(indx, ni, root, sub_vbn, NULL);
+		if (IS_ERR(in)) {
+			err = PTR_ERR(in);
+			goto out;
+		}
+
+		if (in)
+			fnd_push(fnd, in, NULL);
+	}
+
+	/* Merge fnd2 -> fnd */
+	for (level = 0; level < fnd2->level; level++) {
+		fnd_push(fnd, fnd2->nodes[level], fnd2->de[level]);
+		fnd2->nodes[level] = NULL;
+	}
+	fnd2->level = 0;
+
+	hdr = NULL;
+	for (level = fnd->level; level; level--) {
+		struct indx_node *in = fnd->nodes[level - 1];
+
+		ib = in->index;
+		if (ib_is_empty(ib)) {
+			sub_vbn = ib->vbn;
+		} else {
+			hdr = &ib->ihdr;
+			n2d = in;
+			level2 = level;
+			break;
+		}
+	}
+
+	if (!hdr)
+		hdr = &root->ihdr;
+
+	e = hdr_first_de(hdr);
+	if (!e) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (hdr != &root->ihdr || !de_is_last(e)) {
+		prev = NULL;
+		while (!de_is_last(e)) {
+			if (de_has_vcn(e) && sub_vbn == de_get_vbn_le(e))
+				break;
+			prev = e;
+			e = hdr_next_de(hdr, e);
+			if (!e) {
+				err = -EINVAL;
+				goto out;
+			}
+		}
+
+		if (sub_vbn != de_get_vbn_le(e)) {
+			/*
+			 * Didn't find the parent entry, although this buffer is the parent trail.
+			 * Something is corrupt.
+			 */
+			err = -EINVAL;
+			goto out;
+		}
+
+		if (de_is_last(e)) {
+			/*
+			 * Since we can't remove the end entry, we'll remove its
+			 * predecessor instead. This means we have to transfer the
+			 * predecessor's sub_vcn to the end entry.
+			 * Note: that this index block is not empty, so the
+			 * predecessor must exist
+			 */
+			if (!prev) {
+				err = -EINVAL;
+				goto out;
+			}
+
+			if (de_has_vcn(prev)) {
+				de_set_vbn_le(e, de_get_vbn_le(prev));
+			} else if (de_has_vcn(e)) {
+				le16_sub_cpu(&e->size, sizeof(u64));
+				e->flags &= ~NTFS_IE_HAS_SUBNODES;
+				le32_sub_cpu(&hdr->used, sizeof(u64));
+			}
+			e = prev;
+		}
+
+		/*
+		 * Copy the current entry into a temporary buffer (stripping off its
+		 * down-pointer, if any) and delete it from the current buffer or root,
+		 * as appropriate.
+		 */
+		e_size = le16_to_cpu(e->size);
+		me = ntfs_memdup(e, e_size);
+		if (!me) {
+			err = -ENOMEM;
+			goto out;
+		}
+
+		if (de_has_vcn(me)) {
+			me->flags &= ~NTFS_IE_HAS_SUBNODES;
+			le16_sub_cpu(&me->size, sizeof(u64));
+		}
+
+		hdr_delete_de(hdr, e);
+
+		if (hdr == &root->ihdr) {
+			level = 0;
+			hdr->total = hdr->used;
+
+			/* Shrink resident root attribute */
+			mi_resize_attr(mi, attr, 0 - e_size);
+		} else {
+			indx_write(indx, ni, n2d, 0);
+			level = level2;
+		}
+
+		/* Mark unused buffers as free */
+		trim_bit = -1;
+		for (; level < fnd->level; level++) {
+			ib = fnd->nodes[level]->index;
+			if (ib_is_empty(ib)) {
+				size_t k = le64_to_cpu(ib->vbn) >>
+					   indx->idx2vbn_bits;
+
+				indx_mark_free(indx, ni, k);
+				if (k < trim_bit)
+					trim_bit = k;
+			}
+		}
+
+		fnd_clear(fnd);
+		/*fnd->root_de = NULL;*/
+
+		/*
+		 * Re-insert the entry into the tree.
+		 * Find the spot the tree where we want to insert the new entry.
+		 */
+		err = indx_insert_entry(indx, ni, me, ctx, fnd);
+		ntfs_free(me);
+		if (err)
+			goto out;
+
+		if (trim_bit != -1)
+			indx_shrink(indx, ni, trim_bit);
+	} else {
+		/*
+		 * This tree needs to be collapsed down to an empty root.
+		 * Recreate the index root as an empty leaf and free all the bits the
+		 * index allocation bitmap.
+		 */
+		fnd_clear(fnd);
+		fnd_clear(fnd2);
+
+		in = &s_index_names[indx->type];
+
+		err = attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len,
+				    &indx->alloc_run, 0, NULL, false, NULL);
+		err = ni_remove_attr(ni, ATTR_ALLOC, in->name, in->name_len,
+				     false, NULL);
+		run_close(&indx->alloc_run);
+
+		err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
+				    &indx->bitmap_run, 0, NULL, false, NULL);
+		err = ni_remove_attr(ni, ATTR_BITMAP, in->name, in->name_len,
+				     false, NULL);
+		run_close(&indx->bitmap_run);
+
+		root = indx_get_root(indx, ni, &attr, &mi);
+		if (!root) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		root_size = le32_to_cpu(attr->res.data_size);
+		new_root_size =
+			sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE);
+
+		if (new_root_size != root_size &&
+		    !mi_resize_attr(mi, attr, new_root_size - root_size)) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		/* Fill first entry */
+		e = (struct NTFS_DE *)(root + 1);
+		e->ref.low = 0;
+		e->ref.high = 0;
+		e->ref.seq = 0;
+		e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+		e->flags = NTFS_IE_LAST; // 0x02
+		e->key_size = 0;
+		e->res = 0;
+
+		hdr = &root->ihdr;
+		hdr->flags = 0;
+		hdr->used = hdr->total = cpu_to_le32(
+			new_root_size - offsetof(struct INDEX_ROOT, ihdr));
+		mi->dirty = true;
+
+		if (in->name == I30_NAME)
+			ni->vfs_inode.i_size = 0;
+	}
+
+out:
+	fnd_put(fnd2);
+out1:
+	fnd_put(fnd);
+out2:
+	return err;
+}
+
+int indx_update_dup(struct ntfs_inode *ni, struct ntfs_sb_info *sbi,
+		    const struct ATTR_FILE_NAME *fname,
+		    const struct NTFS_DUP_INFO *dup, int sync)
+{
+	int err, diff;
+	struct NTFS_DE *e = NULL;
+	struct ATTR_FILE_NAME *e_fname;
+	struct ntfs_fnd *fnd;
+	struct INDEX_ROOT *root;
+	struct mft_inode *mi;
+	struct ntfs_index *indx = &ni->dir;
+
+	fnd = fnd_get();
+	if (!fnd) {
+		err = -ENOMEM;
+		goto out1;
+	}
+
+	root = indx_get_root(indx, ni, NULL, &mi);
+	if (!root) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Find entries tree and on disk */
+	err = indx_find(indx, ni, root, fname, fname_full_size(fname), sbi,
+			&diff, &e, fnd);
+	if (err)
+		goto out;
+
+	if (!e) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (diff) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	e_fname = (struct ATTR_FILE_NAME *)(e + 1);
+
+	if (!memcmp(&e_fname->dup, dup, sizeof(*dup))) {
+		/* nothing to update in index! Try to avoid this call */
+		goto out;
+	}
+
+	memcpy(&e_fname->dup, dup, sizeof(*dup));
+
+	if (fnd->level) {
+		err = indx_write(indx, ni, fnd->nodes[fnd->level - 1], sync);
+	} else if (sync) {
+		mi->dirty = true;
+		err = mi_write(mi, 1);
+	} else {
+		mi->dirty = true;
+		mark_inode_dirty(&ni->vfs_inode);
+	}
+
+out:
+	fnd_put(fnd);
+
+out1:
+	return err;
+}
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
new file mode 100644
index 000000000000..4b97065be284
--- /dev/null
+++ b/fs/ntfs3/inode.c
@@ -0,0 +1,2050 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/iversion.h>
+#include <linux/mpage.h>
+#include <linux/namei.h>
+#include <linux/nls.h>
+#include <linux/uio.h>
+#include <linux/version.h>
+#include <linux/writeback.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * ntfs_read_mft
+ *
+ * reads record and parses MFT
+ */
+static struct inode *ntfs_read_mft(struct inode *inode,
+				   const struct cpu_str *name,
+				   const struct MFT_REF *ref)
+{
+	int err = 0;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	struct super_block *sb = inode->i_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	mode_t mode = 0;
+	struct ATTR_STD_INFO5 *std5 = NULL;
+	struct ATTR_LIST_ENTRY *le;
+	struct ATTRIB *attr;
+	bool is_match = false;
+	bool is_root = false;
+	bool is_dir;
+	unsigned long ino = inode->i_ino;
+	u32 rp_fa = 0, asize, t32;
+	u16 roff, rsize, names = 0;
+	const struct ATTR_FILE_NAME *fname = NULL;
+	const struct INDEX_ROOT *root;
+	struct REPARSE_DATA_BUFFER rp; // 0x18 bytes
+	u64 t64;
+	struct MFT_REC *rec;
+	struct runs_tree *run;
+
+	inode->i_op = NULL;
+
+	err = mi_init(&ni->mi, sbi, ino);
+	if (err)
+		goto out;
+
+	if (!sbi->mft.ni && ino == MFT_REC_MFT && !sb->s_root) {
+		t64 = sbi->mft.lbo >> sbi->cluster_bits;
+		t32 = bytes_to_cluster(sbi, MFT_REC_VOL * sbi->record_size);
+		sbi->mft.ni = ni;
+		init_rwsem(&ni->file.run_lock);
+
+		if (!run_add_entry(&ni->file.run, 0, t64, t32, true)) {
+			err = -ENOMEM;
+			goto out;
+		}
+	}
+
+	err = mi_read(&ni->mi, ino == MFT_REC_MFT);
+
+	if (err)
+		goto out;
+
+	rec = ni->mi.mrec;
+
+	if (sbi->flags & NTFS_FLAGS_LOG_REPLAYING) {
+		;
+	} else if (ref->seq != rec->seq) {
+		err = -EINVAL;
+		ntfs_err(sb, "MFT: r=%lx, expect seq=%x instead of %x!", ino,
+			 le16_to_cpu(ref->seq), le16_to_cpu(rec->seq));
+		goto out;
+	} else if (!is_rec_inuse(rec)) {
+		err = -EINVAL;
+		ntfs_err(sb, "Inode r=%x is not in use!", (u32)ino);
+		goto out;
+	}
+
+	if (le32_to_cpu(rec->total) != sbi->record_size) {
+		// bad inode?
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (!is_rec_base(rec))
+		goto Ok;
+
+	/* record should contain $I30 root */
+	is_dir = rec->flags & RECORD_FLAG_DIR;
+
+	inode->i_generation = le16_to_cpu(rec->seq);
+
+	/* Enumerate all struct Attributes MFT */
+	le = NULL;
+	attr = NULL;
+
+	/*
+	 * to reduce tab pressure use goto instead of
+	 * while( (attr = ni_enum_attr_ex(ni, attr, &le, NULL) ))
+	 */
+next_attr:
+	run = NULL;
+	err = -EINVAL;
+	attr = ni_enum_attr_ex(ni, attr, &le, NULL);
+	if (!attr)
+		goto end_enum;
+
+	if (le && le->vcn) {
+		/* This is non primary attribute segment. Ignore if not MFT */
+		if (ino != MFT_REC_MFT || attr->type != ATTR_DATA)
+			goto next_attr;
+
+		run = &ni->file.run;
+		asize = le32_to_cpu(attr->size);
+		goto attr_unpack_run;
+	}
+
+	roff = attr->non_res ? 0 : le16_to_cpu(attr->res.data_off);
+	rsize = attr->non_res ? 0 : le32_to_cpu(attr->res.data_size);
+	asize = le32_to_cpu(attr->size);
+
+	switch (attr->type) {
+	case ATTR_STD:
+		if (attr->non_res ||
+		    asize < sizeof(struct ATTR_STD_INFO) + roff ||
+		    rsize < sizeof(struct ATTR_STD_INFO))
+			goto out;
+
+		if (std5)
+			goto next_attr;
+
+		std5 = Add2Ptr(attr, roff);
+
+#ifdef STATX_BTIME
+		nt2kernel(std5->cr_time, &ni->i_crtime);
+#endif
+		nt2kernel(std5->a_time, &inode->i_atime);
+		nt2kernel(std5->c_time, &inode->i_ctime);
+		nt2kernel(std5->m_time, &inode->i_mtime);
+
+		ni->std_fa = std5->fa;
+
+		if (asize >= sizeof(struct ATTR_STD_INFO5) + roff &&
+		    rsize >= sizeof(struct ATTR_STD_INFO5))
+			ni->std_security_id = std5->security_id;
+		goto next_attr;
+
+	case ATTR_LIST:
+		if (attr->name_len || le || ino == MFT_REC_LOG)
+			goto out;
+
+		err = ntfs_load_attr_list(ni, attr);
+		if (err)
+			goto out;
+
+		le = NULL;
+		attr = NULL;
+		goto next_attr;
+
+	case ATTR_NAME:
+		if (attr->non_res || asize < SIZEOF_ATTRIBUTE_FILENAME + roff ||
+		    rsize < SIZEOF_ATTRIBUTE_FILENAME)
+			goto out;
+
+		fname = Add2Ptr(attr, roff);
+		if (fname->type == FILE_NAME_DOS)
+			goto next_attr;
+
+		names += 1;
+		if (name && name->len == fname->name_len &&
+		    !ntfs_cmp_names_cpu(name, (struct le_str *)&fname->name_len,
+					NULL, false))
+			is_match = true;
+
+		goto next_attr;
+
+	case ATTR_DATA:
+		if (is_dir) {
+			/* ignore data attribute in dir record */
+			goto next_attr;
+		}
+
+		if (ino == MFT_REC_BADCLUST && !attr->non_res)
+			goto next_attr;
+
+		if (attr->name_len &&
+		    ((ino != MFT_REC_BADCLUST || !attr->non_res ||
+		      attr->name_len != ARRAY_SIZE(BAD_NAME) ||
+		      memcmp(attr_name(attr), BAD_NAME, sizeof(BAD_NAME))) &&
+		     (ino != MFT_REC_SECURE || !attr->non_res ||
+		      attr->name_len != ARRAY_SIZE(SDS_NAME) ||
+		      memcmp(attr_name(attr), SDS_NAME, sizeof(SDS_NAME))))) {
+			/* file contains stream attribute. ignore it */
+			goto next_attr;
+		}
+
+		if (is_attr_sparsed(attr))
+			ni->std_fa |= FILE_ATTRIBUTE_SPARSE_FILE;
+		else
+			ni->std_fa &= ~FILE_ATTRIBUTE_SPARSE_FILE;
+
+		if (is_attr_compressed(attr))
+			ni->std_fa |= FILE_ATTRIBUTE_COMPRESSED;
+		else
+			ni->std_fa &= ~FILE_ATTRIBUTE_COMPRESSED;
+
+		if (is_attr_encrypted(attr))
+			ni->std_fa |= FILE_ATTRIBUTE_ENCRYPTED;
+		else
+			ni->std_fa &= ~FILE_ATTRIBUTE_ENCRYPTED;
+
+		if (!attr->non_res) {
+			ni->i_valid = inode->i_size = rsize;
+			inode_set_bytes(inode, rsize);
+			t32 = asize;
+		} else {
+			t32 = le16_to_cpu(attr->nres.run_off);
+		}
+
+		mode = S_IFREG | (0777 & sbi->options.fs_fmask_inv);
+
+		if (!attr->non_res) {
+			ni->ni_flags |= NI_FLAG_RESIDENT;
+			goto next_attr;
+		}
+
+		inode_set_bytes(inode, attr_ondisk_size(attr));
+
+		ni->i_valid = le64_to_cpu(attr->nres.valid_size);
+		inode->i_size = le64_to_cpu(attr->nres.data_size);
+		if (!attr->nres.alloc_size)
+			goto next_attr;
+
+		run = ino == MFT_REC_BITMAP ? &sbi->used.bitmap.run :
+					      &ni->file.run;
+		break;
+
+	case ATTR_ROOT:
+		if (attr->non_res)
+			goto out;
+
+		root = Add2Ptr(attr, roff);
+		is_root = true;
+
+		if (attr->name_len != ARRAY_SIZE(I30_NAME) ||
+		    memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME)))
+			goto next_attr;
+
+		if (root->type != ATTR_NAME ||
+		    root->rule != NTFS_COLLATION_TYPE_FILENAME)
+			goto out;
+
+		if (!is_dir)
+			goto next_attr;
+
+		ni->ni_flags |= NI_FLAG_DIR;
+
+		err = indx_init(&ni->dir, sbi, attr, INDEX_MUTEX_I30);
+		if (err)
+			goto out;
+
+		mode = sb->s_root ?
+			       (S_IFDIR | (0777 & sbi->options.fs_dmask_inv)) :
+			       (S_IFDIR | 0777);
+		goto next_attr;
+
+	case ATTR_ALLOC:
+		if (!is_root || attr->name_len != ARRAY_SIZE(I30_NAME) ||
+		    memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME)))
+			goto next_attr;
+
+		inode->i_size = le64_to_cpu(attr->nres.data_size);
+		ni->i_valid = le64_to_cpu(attr->nres.valid_size);
+		inode_set_bytes(inode, le64_to_cpu(attr->nres.alloc_size));
+
+		run = &ni->dir.alloc_run;
+		break;
+
+	case ATTR_BITMAP:
+		if (ino == MFT_REC_MFT) {
+			if (!attr->non_res)
+				goto out;
+#ifndef NTFS3_64BIT_CLUSTER
+			/* 0x20000000 = 2^32 / 8 */
+			if (le64_to_cpu(attr->nres.alloc_size) >= 0x20000000)
+				goto out;
+#endif
+			run = &sbi->mft.bitmap.run;
+			break;
+		} else if (is_dir && attr->name_len == ARRAY_SIZE(I30_NAME) &&
+			   !memcmp(attr_name(attr), I30_NAME,
+				   sizeof(I30_NAME)) &&
+			   attr->non_res) {
+			run = &ni->dir.bitmap_run;
+			break;
+		}
+		goto next_attr;
+
+	case ATTR_REPARSE:
+		if (attr->name_len)
+			goto next_attr;
+
+		rp_fa = ni_parse_reparse(ni, attr, &rp);
+		switch (rp_fa) {
+		case REPARSE_LINK:
+			if (!attr->non_res) {
+				inode->i_size = rsize;
+				inode_set_bytes(inode, rsize);
+				t32 = asize;
+			} else {
+				inode->i_size =
+					le64_to_cpu(attr->nres.data_size);
+				t32 = le16_to_cpu(attr->nres.run_off);
+			}
+
+			/* Looks like normal symlink */
+			ni->i_valid = inode->i_size;
+
+			/* Clear directory bit */
+			if (ni->ni_flags & NI_FLAG_DIR) {
+				indx_clear(&ni->dir);
+				memset(&ni->dir, 0, sizeof(ni->dir));
+				ni->ni_flags &= ~NI_FLAG_DIR;
+			} else {
+				run_close(&ni->file.run);
+			}
+			mode = S_IFLNK | 0777;
+			is_dir = false;
+			if (attr->non_res) {
+				run = &ni->file.run;
+				goto attr_unpack_run; // double break
+			}
+			break;
+
+		case REPARSE_COMPRESSED:
+			break;
+
+		case REPARSE_DEDUPLICATED:
+			break;
+		}
+		goto next_attr;
+
+	case ATTR_EA_INFO:
+		if (!attr->name_len &&
+		    resident_data_ex(attr, sizeof(struct EA_INFO)))
+			ni->ni_flags |= NI_FLAG_EA;
+		goto next_attr;
+
+	default:
+		goto next_attr;
+	}
+
+attr_unpack_run:
+	roff = le16_to_cpu(attr->nres.run_off);
+
+	t64 = le64_to_cpu(attr->nres.svcn);
+	err = run_unpack_ex(run, sbi, ino, t64, le64_to_cpu(attr->nres.evcn),
+			    t64, Add2Ptr(attr, roff), asize - roff);
+	if (err < 0)
+		goto out;
+	err = 0;
+	goto next_attr;
+
+end_enum:
+
+	if (!std5)
+		goto out;
+
+	if (!is_match && name) {
+		/* reuse rec as buffer for ascii name */
+		err = -ENOENT;
+		goto out;
+	}
+
+	if (std5->fa & FILE_ATTRIBUTE_READONLY)
+		mode &= ~0222;
+
+	/* Setup 'uid' and 'gid' */
+	inode->i_uid = sbi->options.fs_uid;
+	inode->i_gid = sbi->options.fs_gid;
+
+	if (!names) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (S_ISDIR(mode)) {
+		ni->std_fa |= FILE_ATTRIBUTE_DIRECTORY;
+
+		/*
+		 * dot and dot-dot should be included in count but was not
+		 * included in enumeration.
+		 * Usually a hard links to directories are disabled
+		 */
+		set_nlink(inode, 1);
+		inode->i_op = &ntfs_dir_inode_operations;
+		inode->i_fop = &ntfs_dir_operations;
+		ni->i_valid = 0;
+	} else if (S_ISLNK(mode)) {
+		ni->std_fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+		inode->i_op = &ntfs_link_inode_operations;
+		inode->i_fop = NULL;
+		inode_nohighmem(inode); // ??
+		set_nlink(inode, names);
+	} else if (S_ISREG(mode)) {
+		ni->std_fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+
+		set_nlink(inode, names);
+
+		inode->i_op = &ntfs_file_inode_operations;
+		inode->i_fop = &ntfs_file_operations;
+		inode->i_mapping->a_ops =
+			is_compressed(ni) ? &ntfs_aops_cmpr : &ntfs_aops;
+
+		if (ino != MFT_REC_MFT)
+			init_rwsem(&ni->file.run_lock);
+	} else if (fname && fname->home.low == cpu_to_le32(MFT_REC_EXTEND) &&
+		   fname->home.seq == cpu_to_le16(MFT_REC_EXTEND)) {
+		/* Records in $Extend are not a files or general directories */
+	} else {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if ((sbi->options.sys_immutable &&
+	     (std5->fa & FILE_ATTRIBUTE_SYSTEM)) &&
+	    !S_ISFIFO(mode) && !S_ISSOCK(mode) && !S_ISLNK(mode)) {
+		inode->i_flags |= S_IMMUTABLE;
+	} else {
+		inode->i_flags &= ~S_IMMUTABLE;
+	}
+
+	inode->i_mode = mode;
+	if (!(ni->ni_flags & NI_FLAG_EA)) {
+		/* if no xattr then no security (stored in xattr) */
+		inode->i_flags |= S_NOSEC;
+	}
+
+Ok:
+	if (ino == MFT_REC_MFT && !sb->s_root)
+		sbi->mft.ni = NULL;
+
+	unlock_new_inode(inode);
+
+	return inode;
+
+out:
+	if (ino == MFT_REC_MFT && !sb->s_root)
+		sbi->mft.ni = NULL;
+
+	iget_failed(inode);
+	return ERR_PTR(err);
+}
+
+/* returns 1 if match */
+static int ntfs_test_inode(struct inode *inode, const struct MFT_REF *ref)
+{
+	return ino_get(ref) == inode->i_ino;
+}
+
+static int ntfs_set_inode(struct inode *inode, const struct MFT_REF *ref)
+{
+	inode->i_ino = ino_get(ref);
+
+	return 0;
+}
+
+struct inode *ntfs_iget5(struct super_block *sb, const struct MFT_REF *ref,
+			 const struct cpu_str *name)
+{
+	struct inode *inode;
+
+	inode = iget5_locked(sb, ino_get(ref),
+			     (int (*)(struct inode *, void *))ntfs_test_inode,
+			     (int (*)(struct inode *, void *))ntfs_set_inode,
+			     (void *)ref);
+	if (unlikely(!inode))
+		return ERR_PTR(-ENOMEM);
+
+	/* If this is a freshly allocated inode, need to read it now. */
+	if (inode->i_state & I_NEW)
+		inode = ntfs_read_mft(inode, name, ref);
+	else if (ref->seq != ntfs_i(inode)->mi.mrec->seq) {
+		/* inode overlaps? */
+		make_bad_inode(inode);
+	}
+
+	return inode;
+}
+
+enum get_block_ctx {
+	GET_BLOCK_GENERAL = 0,
+	GET_BLOCK_WRITE_BEGIN = 1,
+	GET_BLOCK_DIRECT_IO_R = 2,
+	GET_BLOCK_DIRECT_IO_W = 3,
+	GET_BLOCK_BMAP = 4,
+};
+
+static noinline int ntfs_get_block_vbo(struct inode *inode, u64 vbo,
+				       struct buffer_head *bh, int create,
+				       enum get_block_ctx ctx)
+{
+	struct super_block *sb = inode->i_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	struct page *page = bh->b_page;
+	u8 cluster_bits = sbi->cluster_bits;
+	u32 block_size = sb->s_blocksize;
+	u64 bytes, lbo, valid;
+	u32 off;
+	int err;
+	CLST vcn, lcn, len;
+	bool new;
+
+	/*clear previous state*/
+	clear_buffer_new(bh);
+	clear_buffer_uptodate(bh);
+
+	/* direct write uses 'create=0'*/
+	if (!create && vbo >= ni->i_valid) {
+		/* out of valid */
+		return 0;
+	}
+
+	if (vbo >= inode->i_size) {
+		/* out of size */
+		return 0;
+	}
+
+	if (is_resident(ni)) {
+		ni_lock(ni);
+		err = attr_data_read_resident(ni, page);
+		ni_unlock(ni);
+
+		if (!err)
+			set_buffer_uptodate(bh);
+		bh->b_size = block_size;
+		return err;
+	}
+
+	vcn = vbo >> cluster_bits;
+	off = vbo & sbi->cluster_mask;
+	new = false;
+
+	err = attr_data_get_block(ni, vcn, 1, &lcn, &len, create ? &new : NULL);
+	if (err)
+		goto out;
+
+	if (!len)
+		return 0;
+
+	bytes = ((u64)len << cluster_bits) - off;
+
+	if (lcn == SPARSE_LCN) {
+		if (!create) {
+			if (bh->b_size > bytes)
+				bh->b_size = bytes;
+
+			return 0;
+		}
+		WARN_ON(1);
+	}
+
+	if (new) {
+		set_buffer_new(bh);
+		if ((len << cluster_bits) > block_size)
+			ntfs_sparse_cluster(inode, page, vcn, len);
+	}
+
+	lbo = ((u64)lcn << cluster_bits) + off;
+
+	set_buffer_mapped(bh);
+	bh->b_bdev = sb->s_bdev;
+	bh->b_blocknr = lbo >> sb->s_blocksize_bits;
+
+	valid = ni->i_valid;
+
+	if (ctx == GET_BLOCK_DIRECT_IO_W) {
+		/*ntfs_direct_IO will update ni->i_valid */
+		if (vbo >= valid)
+			set_buffer_new(bh);
+	} else if (create) {
+		/*normal write*/
+		if (vbo >= valid) {
+			set_buffer_new(bh);
+			if (bytes > bh->b_size)
+				bytes = bh->b_size;
+			ni->i_valid = vbo + bytes;
+			mark_inode_dirty(inode);
+		}
+	} else if (valid >= inode->i_size) {
+		/* normal read of normal file*/
+	} else if (vbo >= valid) {
+		/* read out of valid data*/
+		/* should never be here 'cause already checked */
+		clear_buffer_mapped(bh);
+	} else if (vbo + bytes <= valid) {
+		/* normal read */
+	} else if (vbo + block_size <= valid) {
+		/* normal short read */
+		bytes = block_size;
+	} else {
+		/*
+		 * read across valid size: vbo < valid && valid < vbo + block_size
+		 */
+		u32 voff = valid - vbo;
+
+		bh->b_size = bytes = block_size;
+		off = vbo & (PAGE_SIZE - 1);
+		set_bh_page(bh, page, off);
+		ll_rw_block(REQ_OP_READ, 0, 1, &bh);
+		wait_on_buffer(bh);
+		/* Uhhuh. Read error. Complain and punt. */
+		if (!buffer_uptodate(bh)) {
+			err = -EIO;
+			goto out;
+		}
+		zero_user_segment(page, off + voff, off + block_size);
+	}
+
+	if (bh->b_size > bytes)
+		bh->b_size = bytes;
+
+#ifndef __LP64__
+	if (ctx == GET_BLOCK_DIRECT_IO_W || ctx == GET_BLOCK_DIRECT_IO_R) {
+		static_assert(sizeof(size_t) < sizeof(loff_t));
+		if (bytes > 0x40000000u)
+			bh->b_size = 0x40000000u;
+	}
+#endif
+
+	return 0;
+
+out:
+	return err;
+}
+
+int ntfs_get_block(struct inode *inode, sector_t vbn,
+		   struct buffer_head *bh_result, int create)
+{
+	return ntfs_get_block_vbo(inode, (u64)vbn << inode->i_blkbits,
+				  bh_result, create, GET_BLOCK_GENERAL);
+}
+
+static int ntfs_get_block_bmap(struct inode *inode, sector_t vsn,
+			       struct buffer_head *bh_result, int create)
+{
+	return ntfs_get_block_vbo(inode,
+				  (u64)vsn << inode->i_sb->s_blocksize_bits,
+				  bh_result, create, GET_BLOCK_BMAP);
+}
+
+static sector_t ntfs_bmap(struct address_space *mapping, sector_t block)
+{
+	return generic_block_bmap(mapping, block, ntfs_get_block_bmap);
+}
+
+static int ntfs_readpage(struct file *file, struct page *page)
+{
+	int err;
+	struct address_space *mapping = page->mapping;
+	struct inode *inode = mapping->host;
+	struct ntfs_inode *ni = ntfs_i(inode);
+
+	if (is_resident(ni)) {
+		ni_lock(ni);
+		err = attr_data_read_resident(ni, page);
+		ni_unlock(ni);
+		if (err != E_NTFS_NONRESIDENT) {
+			unlock_page(page);
+			return err;
+		}
+	}
+
+	if (is_compressed(ni)) {
+		ni_lock(ni);
+		err = ni_readpage_cmpr(ni, page);
+		ni_unlock(ni);
+		return err;
+	}
+
+	/* normal + sparse files */
+	return mpage_readpage(page, ntfs_get_block);
+}
+
+static void ntfs_readahead(struct readahead_control *rac)
+{
+	struct address_space *mapping = rac->mapping;
+	struct inode *inode = mapping->host;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	u64 valid;
+	loff_t pos;
+
+	if (is_resident(ni)) {
+		/* no readahead for resident */
+		return;
+	}
+
+	if (is_compressed(ni)) {
+		/* no readahead for compressed */
+		return;
+	}
+
+	valid = ni->i_valid;
+	pos = readahead_pos(rac);
+
+	if (valid < i_size_read(inode) && pos <= valid &&
+	    valid < pos + readahead_length(rac)) {
+		/* range cross 'valid'. read it page by page */
+		return;
+	}
+
+	mpage_readahead(rac, ntfs_get_block);
+}
+
+static int ntfs_get_block_direct_IO_R(struct inode *inode, sector_t iblock,
+				      struct buffer_head *bh_result, int create)
+{
+	return ntfs_get_block_vbo(inode, (u64)iblock << inode->i_blkbits,
+				  bh_result, create, GET_BLOCK_DIRECT_IO_R);
+}
+
+static int ntfs_get_block_direct_IO_W(struct inode *inode, sector_t iblock,
+				      struct buffer_head *bh_result, int create)
+{
+	return ntfs_get_block_vbo(inode, (u64)iblock << inode->i_blkbits,
+				  bh_result, create, GET_BLOCK_DIRECT_IO_W);
+}
+
+static ssize_t ntfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct file *file = iocb->ki_filp;
+	struct address_space *mapping = file->f_mapping;
+	struct inode *inode = mapping->host;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	size_t count = iov_iter_count(iter);
+	loff_t vbo = iocb->ki_pos;
+	loff_t end = vbo + count;
+	int wr = iov_iter_rw(iter) & WRITE;
+	const struct iovec *iov = iter->iov;
+	unsigned long nr_segs = iter->nr_segs;
+	loff_t valid;
+	ssize_t ret;
+
+	if (is_resident(ni)) {
+		/*switch to buffered write*/
+		ret = 0;
+		goto out;
+	}
+
+	ret = blockdev_direct_IO(iocb, inode, iter,
+				 wr ? ntfs_get_block_direct_IO_W :
+				      ntfs_get_block_direct_IO_R);
+	valid = ni->i_valid;
+	if (wr) {
+		if (ret <= 0)
+			goto out;
+
+		vbo += ret;
+		if (vbo > valid && !S_ISBLK(inode->i_mode)) {
+			ni->i_valid = vbo;
+			mark_inode_dirty(inode);
+		}
+	} else if (vbo < valid && valid < end) {
+		/* fix page */
+		unsigned long uaddr = ~0ul;
+		struct page *page;
+		long i, npages;
+		size_t dvbo = valid - vbo;
+		size_t off = 0;
+
+		/*Find user address*/
+		for (i = 0; i < nr_segs; i++) {
+			if (off <= dvbo && dvbo < off + iov[i].iov_len) {
+				uaddr = (unsigned long)iov[i].iov_base + dvbo -
+					off;
+				break;
+			}
+			off += iov[i].iov_len;
+		}
+
+		if (uaddr == ~0ul)
+			goto fix_error;
+
+		npages = get_user_pages_unlocked(uaddr, 1, &page, FOLL_WRITE);
+
+		if (npages <= 0)
+			goto fix_error;
+
+		zero_user_segment(page, valid & (PAGE_SIZE - 1), PAGE_SIZE);
+		put_page(page);
+	}
+
+out:
+	return ret;
+fix_error:
+	ntfs_inode_warn(inode, "file garbage at 0x%llx", valid);
+	goto out;
+}
+
+int ntfs_set_size(struct inode *inode, u64 new_size)
+{
+	struct super_block *sb = inode->i_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	int err;
+
+	/* Check for maximum file size */
+	if (is_sparsed(ni) || is_compressed(ni)) {
+		if (new_size > sbi->maxbytes_sparse) {
+			err = -EFBIG;
+			goto out;
+		}
+	} else if (new_size > sbi->maxbytes) {
+		err = -EFBIG;
+		goto out;
+	}
+
+	ni_lock(ni);
+	down_write(&ni->file.run_lock);
+
+	err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, new_size,
+			    &ni->i_valid, true, NULL);
+
+	up_write(&ni->file.run_lock);
+	ni_unlock(ni);
+
+	mark_inode_dirty(inode);
+
+out:
+	return err;
+}
+
+static int ntfs_writepage(struct page *page, struct writeback_control *wbc)
+{
+	struct address_space *mapping = page->mapping;
+	struct inode *inode = mapping->host;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	int err;
+
+	if (is_resident(ni)) {
+		ni_lock(ni);
+		err = attr_data_write_resident(ni, page);
+		ni_unlock(ni);
+		if (err != E_NTFS_NONRESIDENT) {
+			unlock_page(page);
+			return err;
+		}
+	}
+
+	return block_write_full_page(page, ntfs_get_block, wbc);
+}
+
+static int ntfs_writepages(struct address_space *mapping,
+			   struct writeback_control *wbc)
+{
+	struct inode *inode = mapping->host;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	/* redirect call to 'ntfs_writepage' for resident files*/
+	get_block_t *get_block = is_resident(ni) ? NULL : &ntfs_get_block;
+
+	return mpage_writepages(mapping, wbc, get_block);
+}
+
+static int ntfs_get_block_write_begin(struct inode *inode, sector_t vbn,
+				      struct buffer_head *bh_result, int create)
+{
+	return ntfs_get_block_vbo(inode, (u64)vbn << inode->i_blkbits,
+				  bh_result, create, GET_BLOCK_WRITE_BEGIN);
+}
+
+static int ntfs_write_begin(struct file *file, struct address_space *mapping,
+			    loff_t pos, u32 len, u32 flags, struct page **pagep,
+			    void **fsdata)
+{
+	int err;
+	struct inode *inode = mapping->host;
+	struct ntfs_inode *ni = ntfs_i(inode);
+
+	*pagep = NULL;
+	if (is_resident(ni)) {
+		struct page *page = grab_cache_page_write_begin(
+			mapping, pos >> PAGE_SHIFT, flags);
+
+		if (!page) {
+			err = -ENOMEM;
+			goto out;
+		}
+
+		ni_lock(ni);
+		err = attr_data_read_resident(ni, page);
+		ni_unlock(ni);
+
+		if (!err) {
+			*pagep = page;
+			goto out;
+		}
+		unlock_page(page);
+		put_page(page);
+
+		if (err != E_NTFS_NONRESIDENT)
+			goto out;
+	}
+
+	err = block_write_begin(mapping, pos, len, flags, pagep,
+				ntfs_get_block_write_begin);
+
+out:
+	return err;
+}
+
+/* address_space_operations::write_end */
+static int ntfs_write_end(struct file *file, struct address_space *mapping,
+			  loff_t pos, u32 len, u32 copied, struct page *page,
+			  void *fsdata)
+
+{
+	struct inode *inode = mapping->host;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	u64 valid = ni->i_valid;
+	bool dirty = false;
+	int err;
+
+	if (is_resident(ni)) {
+		ni_lock(ni);
+		err = attr_data_write_resident(ni, page);
+		ni_unlock(ni);
+		if (!err) {
+			dirty = true;
+			/* clear any buffers in page*/
+			if (page_has_buffers(page)) {
+				struct buffer_head *head, *bh;
+
+				bh = head = page_buffers(page);
+				do {
+					clear_buffer_dirty(bh);
+					clear_buffer_mapped(bh);
+					set_buffer_uptodate(bh);
+				} while (head != (bh = bh->b_this_page));
+			}
+			SetPageUptodate(page);
+			err = copied;
+		}
+		unlock_page(page);
+		put_page(page);
+	} else {
+		err = generic_write_end(file, mapping, pos, len, copied, page,
+					fsdata);
+	}
+
+	if (err >= 0) {
+		if (!(ni->std_fa & FILE_ATTRIBUTE_ARCHIVE)) {
+			inode->i_ctime = inode->i_mtime = current_time(inode);
+			ni->std_fa |= FILE_ATTRIBUTE_ARCHIVE;
+			dirty = true;
+		}
+
+		if (valid != ni->i_valid) {
+			/* ni->i_valid is changed in ntfs_get_block_vbo */
+			dirty = true;
+		}
+
+		if (dirty)
+			mark_inode_dirty(inode);
+	}
+
+	return err;
+}
+
+int reset_log_file(struct inode *inode)
+{
+	int err;
+	loff_t pos = 0;
+	u32 log_size = inode->i_size;
+	struct address_space *mapping = inode->i_mapping;
+
+	for (;;) {
+		u32 len;
+		void *kaddr;
+		struct page *page;
+
+		len = pos + PAGE_SIZE > log_size ? (log_size - pos) : PAGE_SIZE;
+
+		err = block_write_begin(mapping, pos, len, 0, &page,
+					ntfs_get_block_write_begin);
+		if (err)
+			goto out;
+
+		kaddr = kmap_atomic(page);
+		memset(kaddr, -1, len);
+		kunmap_atomic(kaddr);
+		flush_dcache_page(page);
+
+		err = block_write_end(NULL, mapping, pos, len, len, page, NULL);
+		if (err < 0)
+			goto out;
+		pos += len;
+
+		if (pos >= log_size)
+			break;
+		balance_dirty_pages_ratelimited(mapping);
+	}
+out:
+	mark_inode_dirty_sync(inode);
+
+	return err;
+}
+
+int ntfs3_write_inode(struct inode *inode, struct writeback_control *wbc)
+{
+	return _ni_write_inode(inode, wbc->sync_mode == WB_SYNC_ALL);
+}
+
+int ntfs_sync_inode(struct inode *inode)
+{
+	return _ni_write_inode(inode, 1);
+}
+
+/*
+ * helper function for ntfs_flush_inodes.  This writes both the inode
+ * and the file data blocks, waiting for in flight data blocks before
+ * the start of the call.  It does not wait for any io started
+ * during the call
+ */
+static int writeback_inode(struct inode *inode)
+{
+	int ret = sync_inode_metadata(inode, 0);
+
+	if (!ret)
+		ret = filemap_fdatawrite(inode->i_mapping);
+	return ret;
+}
+
+/*
+ * write data and metadata corresponding to i1 and i2.  The io is
+ * started but we do not wait for any of it to finish.
+ *
+ * filemap_flush is used for the block device, so if there is a dirty
+ * page for a block already in flight, we will not wait and start the
+ * io over again
+ */
+int ntfs_flush_inodes(struct super_block *sb, struct inode *i1,
+		      struct inode *i2)
+{
+	int ret = 0;
+
+	if (i1)
+		ret = writeback_inode(i1);
+	if (!ret && i2)
+		ret = writeback_inode(i2);
+	if (!ret)
+		ret = filemap_flush(sb->s_bdev->bd_inode->i_mapping);
+	return ret;
+}
+
+int inode_write_data(struct inode *inode, const void *data, size_t bytes)
+{
+	pgoff_t idx;
+
+	/* Write non resident data */
+	for (idx = 0; bytes; idx++) {
+		size_t op = bytes > PAGE_SIZE ? PAGE_SIZE : bytes;
+		struct page *page = ntfs_map_page(inode->i_mapping, idx);
+
+		if (IS_ERR(page))
+			return PTR_ERR(page);
+
+		lock_page(page);
+		WARN_ON(!PageUptodate(page));
+		ClearPageUptodate(page);
+
+		memcpy(page_address(page), data, op);
+
+		flush_dcache_page(page);
+		SetPageUptodate(page);
+		unlock_page(page);
+
+		ntfs_unmap_page(page);
+
+		bytes -= op;
+		data = Add2Ptr(data, PAGE_SIZE);
+	}
+	return 0;
+}
+
+/*
+ * number of bytes to for REPARSE_DATA_BUFFER(IO_REPARSE_TAG_SYMLINK)
+ * for unicode string of 'uni_len' length
+ */
+static inline u32 ntfs_reparse_bytes(u32 uni_len)
+{
+	/* header + unicode string + decorated unicode string */
+	return sizeof(short) * (2 * uni_len + 4) +
+	       offsetof(struct REPARSE_DATA_BUFFER,
+			SymbolicLinkReparseBuffer.PathBuffer);
+}
+
+static struct REPARSE_DATA_BUFFER *
+ntfs_create_reparse_buffer(struct ntfs_sb_info *sbi, const char *symname,
+			   u32 size, u16 *nsize)
+{
+	int i, err;
+	struct REPARSE_DATA_BUFFER *rp;
+	__le16 *rp_name;
+	typeof(rp->SymbolicLinkReparseBuffer) *rs;
+
+	rp = ntfs_zalloc(ntfs_reparse_bytes(2 * size + 2));
+	if (!rp)
+		return ERR_PTR(-ENOMEM);
+
+	rs = &rp->SymbolicLinkReparseBuffer;
+	rp_name = rs->PathBuffer;
+
+	/* Convert link name to utf16 */
+	err = ntfs_nls_to_utf16(sbi, symname, size,
+				(struct cpu_str *)(rp_name - 1), 2 * size,
+				UTF16_LITTLE_ENDIAN);
+	if (err < 0)
+		goto out;
+
+	/* err = the length of unicode name of symlink */
+	*nsize = ntfs_reparse_bytes(err);
+
+	if (*nsize > sbi->reparse.max_size) {
+		err = -EFBIG;
+		goto out;
+	}
+
+	/* translate linux '/' into windows '\' */
+	for (i = 0; i < err; i++) {
+		if (rp_name[i] == cpu_to_le16('/'))
+			rp_name[i] = cpu_to_le16('\\');
+	}
+
+	rp->ReparseTag = IO_REPARSE_TAG_SYMLINK;
+	rp->ReparseDataLength =
+		cpu_to_le16(*nsize - offsetof(struct REPARSE_DATA_BUFFER,
+					      SymbolicLinkReparseBuffer));
+
+	/* PrintName + SubstituteName */
+	rs->SubstituteNameOffset = cpu_to_le16(sizeof(short) * err);
+	rs->SubstituteNameLength = cpu_to_le16(sizeof(short) * err + 8);
+	rs->PrintNameLength = rs->SubstituteNameOffset;
+
+	/*
+	 * TODO: use relative path if possible to allow windows to parse this path
+	 * 0-absolute path 1- relative path (SYMLINK_FLAG_RELATIVE)
+	 */
+	rs->Flags = 0;
+
+	memmove(rp_name + err + 4, rp_name, sizeof(short) * err);
+
+	/* decorate SubstituteName */
+	rp_name += err;
+	rp_name[0] = cpu_to_le16('\\');
+	rp_name[1] = cpu_to_le16('?');
+	rp_name[2] = cpu_to_le16('?');
+	rp_name[3] = cpu_to_le16('\\');
+
+	return rp;
+out:
+	ntfs_free(rp);
+	return ERR_PTR(err);
+}
+
+int ntfs_create_inode(struct inode *dir, struct dentry *dentry,
+		      const struct cpu_str *uni, umode_t mode, dev_t dev,
+		      const char *symname, u32 size, int excl,
+		      struct ntfs_fnd *fnd, struct inode **new_inode)
+{
+	int err;
+	struct super_block *sb = dir->i_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	const struct qstr *name = &dentry->d_name;
+	CLST ino = 0;
+	struct ntfs_inode *dir_ni = ntfs_i(dir);
+	struct ntfs_inode *ni = NULL;
+	struct inode *inode = NULL;
+	struct ATTRIB *attr;
+	struct ATTR_STD_INFO5 *std5;
+	struct ATTR_FILE_NAME *fname;
+	struct MFT_REC *rec;
+	u32 asize, dsize, sd_size;
+	enum FILE_ATTRIBUTE fa;
+	__le32 security_id = SECURITY_ID_INVALID;
+	CLST vcn;
+	const void *sd;
+	u16 t16, nsize = 0, aid = 0;
+	struct INDEX_ROOT *root, *dir_root;
+	struct NTFS_DE *e, *new_de = NULL;
+	struct REPARSE_DATA_BUFFER *rp = NULL;
+	bool is_dir = S_ISDIR(mode);
+	bool is_link = S_ISLNK(mode);
+	bool rp_inserted = false;
+	bool is_sp = S_ISCHR(mode) || S_ISBLK(mode) || S_ISFIFO(mode) ||
+		     S_ISSOCK(mode);
+
+	if (is_sp)
+		return -EOPNOTSUPP;
+
+	dir_root = indx_get_root(&dir_ni->dir, dir_ni, NULL, NULL);
+	if (!dir_root)
+		return -EINVAL;
+
+	if (is_dir) {
+		/* use parent's directory attributes */
+		fa = dir_ni->std_fa | FILE_ATTRIBUTE_DIRECTORY |
+		     FILE_ATTRIBUTE_ARCHIVE;
+	} else if (is_link) {
+		/* It is good idea that link should be the same type (file/dir) as target */
+		fa = FILE_ATTRIBUTE_REPARSE_POINT;
+
+		/*
+		 * linux: there are dir/file/symlink and so on
+		 * NTFS: symlinks are "dir + reparse" or "file + reparse"
+		 * It is good idea to create:
+		 * dir + reparse if 'symname' points to directory
+		 * or
+		 * file + reparse if 'symname' points to file
+		 * Unfortunately kern_path hangs if symname contains 'dir'
+		 */
+
+		/*
+		 *	struct path path;
+		 *
+		 *	if (!kern_path(symname, LOOKUP_FOLLOW, &path)){
+		 *		struct inode *target = d_inode(path.dentry);
+		 *
+		 *		if (S_ISDIR(target->i_mode))
+		 *			fa |= FILE_ATTRIBUTE_DIRECTORY;
+		 *		// if ( target->i_sb == sb ){
+		 *		//	use relative path?
+		 *		// }
+		 *		path_put(&path);
+		 *	}
+		 */
+	} else if (sbi->options.sparse) {
+		/* sparsed regular file, cause option 'sparse' */
+		fa = FILE_ATTRIBUTE_SPARSE_FILE | FILE_ATTRIBUTE_ARCHIVE;
+	} else if (dir_ni->std_fa & FILE_ATTRIBUTE_COMPRESSED) {
+		/* compressed regular file, if parent is compressed */
+		fa = FILE_ATTRIBUTE_COMPRESSED | FILE_ATTRIBUTE_ARCHIVE;
+	} else {
+		/* regular file, default attributes */
+		fa = FILE_ATTRIBUTE_ARCHIVE;
+	}
+
+	if (!(mode & 0222))
+		fa |= FILE_ATTRIBUTE_READONLY;
+
+	/* allocate PATH_MAX bytes */
+	new_de = __getname();
+	if (!new_de) {
+		err = -ENOMEM;
+		goto out1;
+	}
+
+	/*mark rw ntfs as dirty. it will be cleared at umount*/
+	ntfs_set_state(sbi, NTFS_DIRTY_DIRTY);
+
+	/* Step 1: allocate and fill new mft record */
+	err = ntfs_look_free_mft(sbi, &ino, false, NULL, NULL);
+	if (err)
+		goto out2;
+
+	ni = ntfs_new_inode(sbi, ino, fa & FILE_ATTRIBUTE_DIRECTORY);
+	if (IS_ERR(ni)) {
+		err = PTR_ERR(ni);
+		ni = NULL;
+		goto out3;
+	}
+	inode = &ni->vfs_inode;
+
+	inode->i_atime = inode->i_mtime = inode->i_ctime = ni->i_crtime =
+		current_time(inode);
+
+	rec = ni->mi.mrec;
+	rec->hard_links = cpu_to_le16(1);
+	attr = Add2Ptr(rec, le16_to_cpu(rec->attr_off));
+
+	/* Get default security id */
+	sd = s_default_security;
+	sd_size = sizeof(s_default_security);
+
+	if (is_ntfs3(sbi)) {
+		security_id = dir_ni->std_security_id;
+		if (le32_to_cpu(security_id) < SECURITY_ID_FIRST) {
+			security_id = sbi->security.def_security_id;
+
+			if (security_id == SECURITY_ID_INVALID &&
+			    !ntfs_insert_security(sbi, sd, sd_size,
+						  &security_id, NULL))
+				sbi->security.def_security_id = security_id;
+		}
+	}
+
+	/* Insert standard info */
+	std5 = Add2Ptr(attr, SIZEOF_RESIDENT);
+
+	if (security_id == SECURITY_ID_INVALID) {
+		dsize = sizeof(struct ATTR_STD_INFO);
+	} else {
+		dsize = sizeof(struct ATTR_STD_INFO5);
+		std5->security_id = security_id;
+		ni->std_security_id = security_id;
+	}
+	asize = SIZEOF_RESIDENT + dsize;
+
+	attr->type = ATTR_STD;
+	attr->size = cpu_to_le32(asize);
+	attr->id = cpu_to_le16(aid++);
+	attr->res.data_off = SIZEOF_RESIDENT_LE;
+	attr->res.data_size = cpu_to_le32(dsize);
+
+	std5->cr_time = std5->m_time = std5->c_time = std5->a_time =
+		kernel2nt(&inode->i_atime);
+
+	ni->std_fa = fa;
+	std5->fa = fa;
+
+	attr = Add2Ptr(attr, asize);
+
+	/* Insert file name */
+	err = fill_name_de(sbi, new_de, name, uni);
+	if (err)
+		goto out4;
+
+	fname = (struct ATTR_FILE_NAME *)(new_de + 1);
+
+	new_de->ref.low = cpu_to_le32(ino);
+#ifdef NTFS3_64BIT_CLUSTER
+	new_de->ref.high = cpu_to_le16(ino >> 32);
+	fname->home.high = cpu_to_le16(dir->i_ino >> 32);
+#endif
+	new_de->ref.seq = rec->seq;
+
+	fname->home.low = cpu_to_le32(dir->i_ino & 0xffffffff);
+	fname->home.seq = dir_ni->mi.mrec->seq;
+
+	fname->dup.cr_time = fname->dup.m_time = fname->dup.c_time =
+		fname->dup.a_time = std5->cr_time;
+	fname->dup.alloc_size = fname->dup.data_size = 0;
+	fname->dup.fa = std5->fa;
+	fname->dup.ea_size = fname->dup.reparse = 0;
+
+	dsize = le16_to_cpu(new_de->key_size);
+	asize = QuadAlign(SIZEOF_RESIDENT + dsize);
+
+	attr->type = ATTR_NAME;
+	attr->size = cpu_to_le32(asize);
+	attr->res.data_off = SIZEOF_RESIDENT_LE;
+	attr->res.flags = RESIDENT_FLAG_INDEXED;
+	attr->id = cpu_to_le16(aid++);
+	attr->res.data_size = cpu_to_le32(dsize);
+	memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), fname, dsize);
+
+	attr = Add2Ptr(attr, asize);
+
+	if (security_id == SECURITY_ID_INVALID) {
+		/* Insert security attribute */
+		asize = SIZEOF_RESIDENT + QuadAlign(sd_size);
+
+		attr->type = ATTR_SECURE;
+		attr->size = cpu_to_le32(asize);
+		attr->id = cpu_to_le16(aid++);
+		attr->res.data_off = SIZEOF_RESIDENT_LE;
+		attr->res.data_size = cpu_to_le32(sd_size);
+		memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), sd, sd_size);
+
+		attr = Add2Ptr(attr, asize);
+	}
+
+	if (fa & FILE_ATTRIBUTE_DIRECTORY) {
+		/*
+		 * regular directory or symlink to directory
+		 * Create root attribute
+		 */
+		dsize = sizeof(struct INDEX_ROOT) + sizeof(struct NTFS_DE);
+		asize = sizeof(I30_NAME) + SIZEOF_RESIDENT + dsize;
+
+		attr->type = ATTR_ROOT;
+		attr->size = cpu_to_le32(asize);
+		attr->id = cpu_to_le16(aid++);
+
+		attr->name_len = ARRAY_SIZE(I30_NAME);
+		attr->name_off = SIZEOF_RESIDENT_LE;
+		attr->res.data_off =
+			cpu_to_le16(sizeof(I30_NAME) + SIZEOF_RESIDENT);
+		attr->res.data_size = cpu_to_le32(dsize);
+		memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), I30_NAME,
+		       sizeof(I30_NAME));
+
+		root = Add2Ptr(attr, sizeof(I30_NAME) + SIZEOF_RESIDENT);
+		memcpy(root, dir_root, offsetof(struct INDEX_ROOT, ihdr));
+		root->ihdr.de_off =
+			cpu_to_le32(sizeof(struct INDEX_HDR)); // 0x10
+		root->ihdr.used = cpu_to_le32(sizeof(struct INDEX_HDR) +
+					      sizeof(struct NTFS_DE));
+		root->ihdr.total = root->ihdr.used;
+
+		e = Add2Ptr(root, sizeof(struct INDEX_ROOT));
+		e->size = cpu_to_le16(sizeof(struct NTFS_DE));
+		e->flags = NTFS_IE_LAST;
+	} else if (is_link) {
+		/*
+		 * symlink to file
+		 * Create empty resident data attribute
+		 */
+		asize = SIZEOF_RESIDENT;
+
+		/* insert empty ATTR_DATA */
+		attr->type = ATTR_DATA;
+		attr->size = cpu_to_le32(SIZEOF_RESIDENT);
+		attr->id = cpu_to_le16(aid++);
+		attr->name_off = SIZEOF_RESIDENT_LE;
+		attr->res.data_off = SIZEOF_RESIDENT_LE;
+	} else {
+		/*
+		 * regular file
+		 */
+		attr->type = ATTR_DATA;
+		attr->id = cpu_to_le16(aid++);
+		/* Create empty non resident data attribute */
+		attr->non_res = 1;
+		attr->nres.evcn = cpu_to_le64(-1ll);
+		if (fa & FILE_ATTRIBUTE_SPARSE_FILE) {
+			attr->size = cpu_to_le32(SIZEOF_NONRESIDENT_EX + 8);
+			attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+			attr->flags = ATTR_FLAG_SPARSED;
+			asize = SIZEOF_NONRESIDENT_EX + 8;
+		} else if (fa & FILE_ATTRIBUTE_COMPRESSED) {
+			attr->size = cpu_to_le32(SIZEOF_NONRESIDENT_EX + 8);
+			attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+			attr->flags = ATTR_FLAG_COMPRESSED;
+			attr->nres.c_unit = COMPRESSION_UNIT;
+			asize = SIZEOF_NONRESIDENT_EX + 8;
+		} else {
+			attr->size = cpu_to_le32(SIZEOF_NONRESIDENT + 8);
+			attr->name_off = SIZEOF_NONRESIDENT_LE;
+			asize = SIZEOF_NONRESIDENT + 8;
+		}
+		attr->nres.run_off = attr->name_off;
+	}
+
+	if (is_dir) {
+		ni->ni_flags |= NI_FLAG_DIR;
+		err = indx_init(&ni->dir, sbi, attr, INDEX_MUTEX_I30);
+		if (err)
+			goto out4;
+	} else if (is_link) {
+		rp = ntfs_create_reparse_buffer(sbi, symname, size, &nsize);
+
+		if (IS_ERR(rp)) {
+			err = PTR_ERR(rp);
+			rp = NULL;
+			goto out4;
+		}
+
+		/*
+		 * Insert ATTR_REPARSE
+		 */
+		attr = Add2Ptr(attr, asize);
+		attr->type = ATTR_REPARSE;
+		attr->id = cpu_to_le16(aid++);
+
+		/* resident or non resident? */
+		asize = QuadAlign(SIZEOF_RESIDENT + nsize);
+		t16 = PtrOffset(rec, attr);
+
+		if (asize + t16 + 8 > sbi->record_size) {
+			CLST alen;
+			CLST clst = bytes_to_cluster(sbi, nsize);
+
+			/* bytes per runs */
+			t16 = sbi->record_size - t16 - SIZEOF_NONRESIDENT;
+
+			attr->non_res = 1;
+			attr->nres.evcn = cpu_to_le64(clst - 1);
+			attr->name_off = SIZEOF_NONRESIDENT_LE;
+			attr->nres.run_off = attr->name_off;
+			attr->nres.data_size = cpu_to_le64(nsize);
+			attr->nres.valid_size = attr->nres.data_size;
+			attr->nres.alloc_size =
+				cpu_to_le64(ntfs_up_cluster(sbi, nsize));
+
+			err = attr_allocate_clusters(sbi, &ni->file.run, 0, 0,
+						     clst, NULL, 0, &alen, 0,
+						     NULL);
+			if (err)
+				goto out5;
+
+			err = run_pack(&ni->file.run, 0, clst,
+				       Add2Ptr(attr, SIZEOF_NONRESIDENT), t16,
+				       &vcn);
+			if (err < 0)
+				goto out5;
+
+			if (vcn != clst) {
+				err = -EINVAL;
+				goto out5;
+			}
+
+			asize = SIZEOF_NONRESIDENT + QuadAlign(err);
+			inode->i_size = nsize;
+		} else {
+			attr->res.data_off = SIZEOF_RESIDENT_LE;
+			attr->res.data_size = cpu_to_le32(nsize);
+			memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), rp, nsize);
+			inode->i_size = nsize;
+			nsize = 0;
+		}
+
+		attr->size = cpu_to_le32(asize);
+
+		err = ntfs_insert_reparse(sbi, IO_REPARSE_TAG_SYMLINK,
+					  &new_de->ref);
+		if (err)
+			goto out5;
+
+		rp_inserted = true;
+	}
+
+	attr = Add2Ptr(attr, asize);
+	attr->type = ATTR_END;
+
+	rec->used = cpu_to_le32(PtrOffset(rec, attr) + 8);
+	rec->next_attr_id = cpu_to_le16(aid);
+
+	/* Step 2: Add new name in index */
+	err = indx_insert_entry(&dir_ni->dir, dir_ni, new_de, sbi, fnd);
+	if (err)
+		goto out6;
+
+	/* Update current directory record */
+	mark_inode_dirty(dir);
+
+	/* Fill vfs inode fields */
+	inode->i_uid = sbi->options.uid ? sbi->options.fs_uid : current_fsuid();
+	inode->i_gid =
+		sbi->options.gid ?
+			sbi->options.fs_gid :
+			(dir->i_mode & S_ISGID) ? dir->i_gid : current_fsgid();
+	inode->i_generation = le16_to_cpu(rec->seq);
+
+	dir->i_mtime = dir->i_ctime = inode->i_atime;
+
+	if (is_dir) {
+		if (dir->i_mode & S_ISGID)
+			mode |= S_ISGID;
+		inode->i_op = &ntfs_dir_inode_operations;
+		inode->i_fop = &ntfs_dir_operations;
+	} else if (is_link) {
+		inode->i_op = &ntfs_link_inode_operations;
+		inode->i_fop = NULL;
+		inode->i_mapping->a_ops = &ntfs_aops;
+	} else {
+		inode->i_op = &ntfs_file_inode_operations;
+		inode->i_fop = &ntfs_file_operations;
+		inode->i_mapping->a_ops =
+			is_compressed(ni) ? &ntfs_aops_cmpr : &ntfs_aops;
+		init_rwsem(&ni->file.run_lock);
+	}
+
+	inode->i_mode = mode;
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+	if (!is_link && (sb->s_flags & SB_POSIXACL)) {
+		err = ntfs_init_acl(inode, dir);
+		if (err)
+			goto out6;
+	} else
+#endif
+	{
+		inode->i_flags |= S_NOSEC;
+	}
+
+	/* Write non resident data */
+	if (nsize) {
+		err = ntfs_sb_write_run(sbi, &ni->file.run, 0, rp, nsize);
+		if (err)
+			goto out7;
+	}
+
+	/* call 'd_instantiate' after inode->i_op is set but before finish_open */
+	d_instantiate(dentry, inode);
+
+	mark_inode_dirty(inode);
+	mark_inode_dirty(dir);
+
+	/* normal exit */
+	goto out2;
+
+out7:
+
+	/* undo 'indx_insert_entry' */
+	indx_delete_entry(&dir_ni->dir, dir_ni, new_de + 1,
+			  le16_to_cpu(new_de->key_size), sbi);
+out6:
+	if (rp_inserted)
+		ntfs_remove_reparse(sbi, IO_REPARSE_TAG_SYMLINK, &new_de->ref);
+
+out5:
+	if (is_dir || run_is_empty(&ni->file.run))
+		goto out4;
+
+	run_deallocate(sbi, &ni->file.run, false);
+
+out4:
+	clear_rec_inuse(rec);
+	clear_nlink(inode);
+	ni->mi.dirty = false;
+	discard_new_inode(inode);
+out3:
+	ntfs_mark_rec_free(sbi, ino);
+
+out2:
+	__putname(new_de);
+	ntfs_free(rp);
+
+out1:
+	if (err)
+		return err;
+
+	unlock_new_inode(inode);
+
+	*new_inode = inode;
+	return 0;
+}
+
+int ntfs_link_inode(struct inode *inode, struct dentry *dentry)
+{
+	int err;
+	struct inode *dir = d_inode(dentry->d_parent);
+	struct ntfs_inode *dir_ni = ntfs_i(dir);
+	struct ntfs_inode *ni = ntfs_i(inode);
+	struct super_block *sb = inode->i_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	const struct qstr *name = &dentry->d_name;
+	struct NTFS_DE *new_de = NULL;
+	struct ATTR_FILE_NAME *fname;
+	struct ATTRIB *attr;
+	u16 key_size;
+	struct INDEX_ROOT *dir_root;
+
+	dir_root = indx_get_root(&dir_ni->dir, dir_ni, NULL, NULL);
+	if (!dir_root)
+		return -EINVAL;
+
+	/* allocate PATH_MAX bytes */
+	new_de = __getname();
+	if (!new_de)
+		return -ENOMEM;
+
+	/*mark rw ntfs as dirty. it will be cleared at umount*/
+	ntfs_set_state(ni->mi.sbi, NTFS_DIRTY_DIRTY);
+
+	// Insert file name
+	err = fill_name_de(sbi, new_de, name, NULL);
+	if (err)
+		goto out;
+
+	key_size = le16_to_cpu(new_de->key_size);
+	fname = (struct ATTR_FILE_NAME *)(new_de + 1);
+
+	err = ni_insert_resident(ni, key_size, ATTR_NAME, NULL, 0, &attr, NULL);
+	if (err)
+		goto out;
+
+	new_de->ref.low = cpu_to_le32(inode->i_ino);
+#ifdef NTFS3_64BIT_CLUSTER
+	new_de->ref.high = cpu_to_le16(inode->i_ino >> 32);
+	fname->home.high = cpu_to_le16(dir->i_ino >> 32);
+#endif
+	new_de->ref.seq = ni->mi.mrec->seq;
+
+	fname->home.low = cpu_to_le32(dir->i_ino & 0xffffffff);
+	fname->home.seq = dir_ni->mi.mrec->seq;
+
+	fname->dup.cr_time = fname->dup.m_time = fname->dup.c_time =
+		fname->dup.a_time = kernel2nt(&inode->i_ctime);
+	fname->dup.alloc_size = fname->dup.data_size = 0;
+	fname->dup.fa = ni->std_fa;
+	fname->dup.ea_size = fname->dup.reparse = 0;
+
+	memcpy(Add2Ptr(attr, SIZEOF_RESIDENT), fname, key_size);
+
+	err = indx_insert_entry(&dir_ni->dir, dir_ni, new_de, sbi, NULL);
+	if (err)
+		goto out;
+
+	le16_add_cpu(&ni->mi.mrec->hard_links, 1);
+	ni->mi.dirty = true;
+
+out:
+	__putname(new_de);
+	return err;
+}
+
+/*
+ * ntfs_unlink_inode
+ *
+ * inode_operations::unlink
+ * inode_operations::rmdir
+ */
+int ntfs_unlink_inode(struct inode *dir, const struct dentry *dentry)
+{
+	int err;
+	struct super_block *sb = dir->i_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	struct inode *inode = d_inode(dentry);
+	struct ntfs_inode *ni = ntfs_i(inode);
+	const struct qstr *name = &dentry->d_name;
+	struct ntfs_inode *dir_ni = ntfs_i(dir);
+	struct ntfs_index *indx = &dir_ni->dir;
+	struct cpu_str *uni = NULL;
+	struct ATTR_FILE_NAME *fname;
+	u8 name_type;
+	struct ATTR_LIST_ENTRY *le;
+	struct MFT_REF ref;
+	bool is_dir = S_ISDIR(inode->i_mode);
+	struct INDEX_ROOT *dir_root;
+
+	dir_root = indx_get_root(indx, dir_ni, NULL, NULL);
+	if (!dir_root)
+		return -EINVAL;
+
+	ni_lock(ni);
+
+	if (is_dir && !dir_is_empty(inode)) {
+		err = -ENOTEMPTY;
+		goto out1;
+	}
+
+	if (ntfs_is_meta_file(sbi, inode->i_ino)) {
+		err = -EINVAL;
+		goto out1;
+	}
+
+	/* allocate PATH_MAX bytes */
+	uni = __getname();
+	if (!uni) {
+		err = -ENOMEM;
+		goto out1;
+	}
+
+	/* Convert input string to unicode */
+	err = ntfs_nls_to_utf16(sbi, name->name, name->len, uni, NTFS_NAME_LEN,
+				UTF16_HOST_ENDIAN);
+	if (err < 0)
+		goto out4;
+
+	/*mark rw ntfs as dirty. it will be cleared at umount*/
+	ntfs_set_state(sbi, NTFS_DIRTY_DIRTY);
+
+	/* find name in record */
+#ifdef NTFS3_64BIT_CLUSTER
+	ref.low = cpu_to_le32(dir->i_ino & 0xffffffff);
+	ref.high = cpu_to_le16(dir->i_ino >> 32);
+#else
+	ref.low = cpu_to_le32(dir->i_ino & 0xffffffff);
+	ref.high = 0;
+#endif
+	ref.seq = dir_ni->mi.mrec->seq;
+
+	le = NULL;
+	fname = ni_fname_name(ni, uni, &ref, &le);
+	if (!fname) {
+		err = -ENOENT;
+		goto out3;
+	}
+
+	name_type = paired_name(fname->type);
+
+	err = indx_delete_entry(indx, dir_ni, fname, fname_full_size(fname),
+				sbi);
+	if (err)
+		goto out4;
+
+	/* Then remove name from mft */
+	ni_remove_attr_le(ni, attr_from_name(fname), le);
+
+	le16_add_cpu(&ni->mi.mrec->hard_links, -1);
+	ni->mi.dirty = true;
+
+	if (name_type != FILE_NAME_POSIX) {
+		/* Now we should delete name by type */
+		fname = ni_fname_type(ni, name_type, &le);
+		if (fname) {
+			err = indx_delete_entry(indx, dir_ni, fname,
+						fname_full_size(fname), sbi);
+			if (err)
+				goto out4;
+
+			ni_remove_attr_le(ni, attr_from_name(fname), le);
+
+			le16_add_cpu(&ni->mi.mrec->hard_links, -1);
+		}
+	}
+
+out4:
+	switch (err) {
+	case 0:
+		drop_nlink(inode);
+	case -ENOTEMPTY:
+	case -ENOSPC:
+	case -EROFS:
+		break;
+	default:
+		make_bad_inode(inode);
+	}
+
+	dir->i_mtime = dir->i_ctime = current_time(dir);
+	mark_inode_dirty(dir);
+	inode->i_ctime = dir->i_ctime;
+	if (inode->i_nlink)
+		mark_inode_dirty(inode);
+
+out3:
+	__putname(uni);
+out1:
+	ni_unlock(ni);
+	return err;
+}
+
+void ntfs_evict_inode(struct inode *inode)
+{
+	truncate_inode_pages_final(&inode->i_data);
+
+	if (inode->i_nlink)
+		_ni_write_inode(inode, inode_needs_sync(inode));
+
+	invalidate_inode_buffers(inode);
+	clear_inode(inode);
+
+	ni_clear(ntfs_i(inode));
+}
+
+static noinline int ntfs_readlink_hlp(struct inode *inode, char *buffer,
+				      int buflen)
+{
+	int i, err = 0;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	struct super_block *sb = inode->i_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	u64 i_size = inode->i_size;
+	u16 nlen = 0;
+	void *to_free = NULL;
+	struct REPARSE_DATA_BUFFER *rp;
+	struct le_str *uni;
+	struct ATTRIB *attr;
+
+	/* Reparse data present. Try to parse it */
+	static_assert(!offsetof(struct REPARSE_DATA_BUFFER, ReparseTag));
+	static_assert(sizeof(u32) == sizeof(rp->ReparseTag));
+
+	*buffer = 0;
+
+	/* Read into temporal buffer */
+	if (i_size > sbi->reparse.max_size || i_size <= sizeof(u32)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	attr = ni_find_attr(ni, NULL, NULL, ATTR_REPARSE, NULL, 0, NULL, NULL);
+	if (!attr) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (!attr->non_res) {
+		rp = resident_data_ex(attr, i_size);
+		if (!rp) {
+			err = -EINVAL;
+			goto out;
+		}
+	} else {
+		rp = ntfs_malloc(i_size);
+		if (!rp) {
+			err = -ENOMEM;
+			goto out;
+		}
+		to_free = rp;
+		err = ntfs_read_run_nb(sbi, &ni->file.run, 0, rp, i_size, NULL);
+		if (err)
+			goto out;
+	}
+
+	err = -EINVAL;
+
+	/* Microsoft Tag */
+	switch (rp->ReparseTag) {
+	case IO_REPARSE_TAG_MOUNT_POINT:
+		/* Mount points and junctions */
+		/* Can we use 'Rp->MountPointReparseBuffer.PrintNameLength'? */
+		if (i_size <= offsetof(struct REPARSE_DATA_BUFFER,
+				       MountPointReparseBuffer.PathBuffer))
+			goto out;
+		uni = Add2Ptr(rp,
+			      offsetof(struct REPARSE_DATA_BUFFER,
+				       MountPointReparseBuffer.PathBuffer) +
+				      le16_to_cpu(rp->MountPointReparseBuffer
+							  .PrintNameOffset) -
+				      2);
+		nlen = le16_to_cpu(rp->MountPointReparseBuffer.PrintNameLength);
+		break;
+
+	case IO_REPARSE_TAG_SYMLINK:
+		/* FolderSymbolicLink */
+		/* Can we use 'Rp->SymbolicLinkReparseBuffer.PrintNameLength'? */
+		if (i_size <= offsetof(struct REPARSE_DATA_BUFFER,
+				       SymbolicLinkReparseBuffer.PathBuffer))
+			goto out;
+		uni = Add2Ptr(rp,
+			      offsetof(struct REPARSE_DATA_BUFFER,
+				       SymbolicLinkReparseBuffer.PathBuffer) +
+				      le16_to_cpu(rp->SymbolicLinkReparseBuffer
+							  .PrintNameOffset) -
+				      2);
+		nlen = le16_to_cpu(
+			rp->SymbolicLinkReparseBuffer.PrintNameLength);
+		break;
+
+	case IO_REPARSE_TAG_CLOUD:
+	case IO_REPARSE_TAG_CLOUD_1:
+	case IO_REPARSE_TAG_CLOUD_2:
+	case IO_REPARSE_TAG_CLOUD_3:
+	case IO_REPARSE_TAG_CLOUD_4:
+	case IO_REPARSE_TAG_CLOUD_5:
+	case IO_REPARSE_TAG_CLOUD_6:
+	case IO_REPARSE_TAG_CLOUD_7:
+	case IO_REPARSE_TAG_CLOUD_8:
+	case IO_REPARSE_TAG_CLOUD_9:
+	case IO_REPARSE_TAG_CLOUD_A:
+	case IO_REPARSE_TAG_CLOUD_B:
+	case IO_REPARSE_TAG_CLOUD_C:
+	case IO_REPARSE_TAG_CLOUD_D:
+	case IO_REPARSE_TAG_CLOUD_E:
+	case IO_REPARSE_TAG_CLOUD_F:
+		err = sizeof("OneDrive") - 1;
+		if (err > buflen)
+			err = buflen;
+		memcpy(buffer, "OneDrive", err);
+		goto out;
+
+	default:
+		if (IsReparseTagMicrosoft(rp->ReparseTag)) {
+			/* unknown Microsoft Tag */
+			goto out;
+		}
+		if (!IsReparseTagNameSurrogate(rp->ReparseTag) ||
+		    i_size <= sizeof(struct REPARSE_POINT)) {
+			goto out;
+		}
+
+		/* Users tag */
+		uni = Add2Ptr(rp, sizeof(struct REPARSE_POINT) - 2);
+		nlen = le16_to_cpu(rp->ReparseDataLength) -
+		       sizeof(struct REPARSE_POINT);
+	}
+
+	/* Convert nlen from bytes to UNICODE chars */
+	nlen >>= 1;
+
+	/* Check that name is available */
+	if (!nlen || &uni->name[nlen] > (__le16 *)Add2Ptr(rp, i_size))
+		goto out;
+
+	/* If name is already zero terminated then truncate it now */
+	if (!uni->name[nlen - 1])
+		nlen -= 1;
+	uni->len = nlen;
+
+	err = ntfs_utf16_to_nls(sbi, uni, buffer, buflen);
+
+	if (err < 0)
+		goto out;
+
+	/* translate windows '\' into linux '/' */
+	for (i = 0; i < err; i++) {
+		if (buffer[i] == '\\')
+			buffer[i] = '/';
+	}
+
+	/* Always set last zero */
+	buffer[err] = 0;
+out:
+	ntfs_free(to_free);
+	return err;
+}
+
+static const char *ntfs_get_link(struct dentry *de, struct inode *inode,
+				 struct delayed_call *done)
+{
+	int err;
+	char *ret;
+
+	if (!de)
+		return ERR_PTR(-ECHILD);
+
+	ret = kmalloc(PAGE_SIZE, GFP_NOFS);
+	if (!ret)
+		return ERR_PTR(-ENOMEM);
+
+	err = ntfs_readlink_hlp(inode, ret, PAGE_SIZE);
+	if (err < 0) {
+		kfree(ret);
+		return ERR_PTR(err);
+	}
+
+	set_delayed_call(done, kfree_link, ret);
+
+	return ret;
+}
+
+const struct inode_operations ntfs_link_inode_operations = {
+	.get_link = ntfs_get_link,
+	.setattr = ntfs3_setattr,
+	.listxattr = ntfs_listxattr,
+	.permission = ntfs_permission,
+	.get_acl = ntfs_get_acl,
+	.set_acl = ntfs_set_acl,
+};
+
+const struct address_space_operations ntfs_aops = {
+	.readpage = ntfs_readpage,
+	.readahead = ntfs_readahead,
+	.writepage = ntfs_writepage,
+	.writepages = ntfs_writepages,
+	.write_begin = ntfs_write_begin,
+	.write_end = ntfs_write_end,
+	.direct_IO = ntfs_direct_IO,
+	.bmap = ntfs_bmap,
+};
+
+const struct address_space_operations ntfs_aops_cmpr = {
+	.readpage = ntfs_readpage,
+	.readahead = ntfs_readahead,
+};
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
new file mode 100644
index 000000000000..9d9e597c307d
--- /dev/null
+++ b/fs/ntfs3/super.c
@@ -0,0 +1,1479 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ *
+ *                 terminology
+ *
+ * vcn - virtual cluster number - offset inside the file in clusters
+ * vbo - virtual byte offset    - offset inside the file in bytes
+ * lcn - logical cluster number - 0 based cluster in clusters heap
+ * lbo - logical byte offset    - absolute position inside volume
+ * run - maps vcn to lcn        - stored in attributes in packed form
+ * attr - attribute segment     - std/name/data etc records inside MFT
+ * mi  - mft inode              - one MFT record(usually 1024 bytes), consists of attributes
+ * ni  - ntfs inode             - extends linux inode. consists of one or more mft inodes
+ *
+ */
+
+#include <linux/backing-dev.h>
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/exportfs.h>
+#include <linux/fs.h>
+#include <linux/iversion.h>
+#include <linux/module.h>
+#include <linux/nls.h>
+#include <linux/parser.h>
+#include <linux/seq_file.h>
+#include <linux/statfs.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+#include "lib/lib.h"
+#endif
+
+#ifdef CONFIG_PRINTK
+/*
+ * Trace warnings/notices/errors
+ * Thanks Joe Perches <joe@perches.com> for implementation
+ */
+void ntfs_printk(const struct super_block *sb, const char *fmt, ...)
+{
+	struct va_format vaf;
+	va_list args;
+	int level;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+
+	/*should we use different ratelimits for warnings/notices/errors? */
+	if (!___ratelimit(&sbi->msg_ratelimit, "ntfs3"))
+		return;
+
+	va_start(args, fmt);
+
+	level = printk_get_level(fmt);
+	vaf.fmt = printk_skip_level(fmt);
+	vaf.va = &args;
+	printk("%c%cntfs3: %s: %pV\n", KERN_SOH_ASCII, level, sb->s_id, &vaf);
+
+	va_end(args);
+}
+
+static char s_name_buf[512];
+static atomic_t s_name_buf_cnt = ATOMIC_INIT(1); // 1 means 'free s_name_buf'
+
+/* print warnings/notices/errors about inode using name or inode number */
+void ntfs_inode_printk(struct inode *inode, const char *fmt, ...)
+{
+	struct super_block *sb = inode->i_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	char *name;
+	va_list args;
+	struct va_format vaf;
+	int level;
+
+	if (!___ratelimit(&sbi->msg_ratelimit, "ntfs3"))
+		return;
+
+	if (atomic_dec_and_test(&s_name_buf_cnt)) {
+		/* use static allocated buffer */
+		name = s_name_buf;
+	} else {
+		name = kmalloc(sizeof(s_name_buf), GFP_NOFS);
+	}
+
+	if (name) {
+		struct dentry *dentry = d_find_alias(inode);
+		const u32 name_len = ARRAY_SIZE(s_name_buf) - 1;
+
+		if (dentry) {
+			spin_lock(&dentry->d_lock);
+			snprintf(name, name_len, "%s", dentry->d_name.name);
+			spin_unlock(&dentry->d_lock);
+			dput(dentry);
+			name[name_len] = 0; /* to be sure*/
+		} else {
+			name[0] = 0;
+		}
+	}
+
+	va_start(args, fmt);
+
+	level = printk_get_level(fmt);
+	vaf.fmt = printk_skip_level(fmt);
+	vaf.va = &args;
+
+	printk("%c%cntfs3: %s: ino=%lx, \"%s\" %pV\n", KERN_SOH_ASCII, level,
+	       sb->s_id, inode->i_ino, name ? name : "", &vaf);
+
+	va_end(args);
+
+	atomic_inc(&s_name_buf_cnt);
+	if (name != s_name_buf)
+		kfree(name);
+}
+#endif
+
+/*
+ * Shared memory struct.
+ *
+ * on-disk ntfs's upcase table is created by ntfs formater
+ * 'upcase' table is 128K bytes of memory
+ * we should read it into memory when mounting
+ * Several ntfs volumes likely use the same 'upcase' table
+ * It is good idea to share in-memory 'upcase' table between different volumes
+ * Unfortunately winxp/vista/win7 use different upcase tables
+ */
+static DEFINE_SPINLOCK(s_shared_lock);
+
+static struct {
+	void *ptr;
+	u32 len;
+	int cnt;
+} s_shared[8];
+
+/*
+ * ntfs_set_shared
+ *
+ * Returns 'ptr' if pointer was saved in shared memory
+ * Returns NULL if pointer was not shared
+ */
+void *ntfs_set_shared(void *ptr, u32 bytes)
+{
+	void *ret = NULL;
+	int i, j = -1;
+
+	spin_lock(&s_shared_lock);
+	for (i = 0; i < ARRAY_SIZE(s_shared); i++) {
+		if (!s_shared[i].cnt) {
+			j = i;
+		} else if (bytes == s_shared[i].len &&
+			   !memcmp(s_shared[i].ptr, ptr, bytes)) {
+			s_shared[i].cnt += 1;
+			ret = s_shared[i].ptr;
+			break;
+		}
+	}
+
+	if (!ret && j != -1) {
+		s_shared[j].ptr = ptr;
+		s_shared[j].len = bytes;
+		s_shared[j].cnt = 1;
+		ret = ptr;
+	}
+	spin_unlock(&s_shared_lock);
+
+	return ret;
+}
+
+/*
+ * ntfs_put_shared
+ *
+ * Returns 'ptr' if pointer is not shared anymore
+ * Returns NULL if pointer is still shared
+ */
+void *ntfs_put_shared(void *ptr)
+{
+	void *ret = ptr;
+	int i;
+
+	spin_lock(&s_shared_lock);
+	for (i = 0; i < ARRAY_SIZE(s_shared); i++) {
+		if (s_shared[i].cnt && s_shared[i].ptr == ptr) {
+			if (--s_shared[i].cnt)
+				ret = NULL;
+			break;
+		}
+	}
+	spin_unlock(&s_shared_lock);
+
+	return ret;
+}
+
+static inline void clear_mount_options(struct ntfs_mount_options *options)
+{
+	unload_nls(options->nls);
+}
+
+enum Opt {
+	Opt_uid,
+	Opt_gid,
+	Opt_umask,
+	Opt_dmask,
+	Opt_fmask,
+	Opt_immutable,
+	Opt_discard,
+	Opt_force,
+	Opt_sparse,
+	Opt_nohidden,
+	Opt_showmeta,
+	Opt_acl,
+	Opt_noatime,
+	Opt_nls,
+	Opt_prealloc,
+	Opt_no_acs_rules,
+	Opt_err,
+};
+
+static const match_table_t ntfs_tokens = {
+	{ Opt_uid, "uid=%u" },
+	{ Opt_gid, "gid=%u" },
+	{ Opt_umask, "umask=%o" },
+	{ Opt_dmask, "dmask=%o" },
+	{ Opt_fmask, "fmask=%o" },
+	{ Opt_immutable, "sys_immutable" },
+	{ Opt_discard, "discard" },
+	{ Opt_force, "force" },
+	{ Opt_sparse, "sparse" },
+	{ Opt_nohidden, "nohidden" },
+	{ Opt_acl, "acl" },
+	{ Opt_noatime, "noatime" },
+	{ Opt_showmeta, "showmeta" },
+	{ Opt_nls, "nls=%s" },
+	{ Opt_prealloc, "prealloc" },
+	{ Opt_no_acs_rules, "no_acs_rules" },
+	{ Opt_err, NULL },
+};
+
+static noinline int ntfs_parse_options(struct super_block *sb, char *options,
+				       int silent,
+				       struct ntfs_mount_options *opts)
+{
+	char *p;
+	substring_t args[MAX_OPT_ARGS];
+	int option;
+	char nls_name[30];
+	struct nls_table *nls;
+
+	opts->fs_uid = current_uid();
+	opts->fs_gid = current_gid();
+	opts->fs_fmask_inv = opts->fs_dmask_inv = ~current_umask();
+	nls_name[0] = 0;
+
+	if (!options)
+		goto out;
+
+	while ((p = strsep(&options, ","))) {
+		int token;
+
+		if (!*p)
+			continue;
+
+		token = match_token(p, ntfs_tokens, args);
+		switch (token) {
+		case Opt_immutable:
+			opts->sys_immutable = 1;
+			break;
+		case Opt_uid:
+			if (match_int(&args[0], &option))
+				return -EINVAL;
+			opts->fs_uid = make_kuid(current_user_ns(), option);
+			if (!uid_valid(opts->fs_uid))
+				return -EINVAL;
+			opts->uid = 1;
+			break;
+		case Opt_gid:
+			if (match_int(&args[0], &option))
+				return -EINVAL;
+			opts->fs_gid = make_kgid(current_user_ns(), option);
+			if (!gid_valid(opts->fs_gid))
+				return -EINVAL;
+			opts->gid = 1;
+			break;
+		case Opt_umask:
+			if (match_octal(&args[0], &option))
+				return -EINVAL;
+			opts->fs_fmask_inv = opts->fs_dmask_inv = ~option;
+			opts->fmask = opts->dmask = 1;
+			break;
+		case Opt_dmask:
+			if (match_octal(&args[0], &option))
+				return -EINVAL;
+			opts->fs_dmask_inv = ~option;
+			opts->dmask = 1;
+			break;
+		case Opt_fmask:
+			if (match_octal(&args[0], &option))
+				return -EINVAL;
+			opts->fs_fmask_inv = ~option;
+			opts->fmask = 1;
+			break;
+		case Opt_discard:
+			opts->discard = 1;
+			break;
+		case Opt_force:
+			opts->force = 1;
+			break;
+		case Opt_sparse:
+			opts->sparse = 1;
+			break;
+		case Opt_nohidden:
+			opts->nohidden = 1;
+			break;
+		case Opt_acl:
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+			sb->s_flags |= SB_POSIXACL;
+			break;
+#else
+			ntfs_err(sb, "support for ACL not compiled in!");
+			return -EINVAL;
+#endif
+		case Opt_noatime:
+			sb->s_flags |= SB_NOATIME;
+			break;
+		case Opt_showmeta:
+			opts->showmeta = 1;
+			break;
+		case Opt_nls:
+			match_strlcpy(nls_name, &args[0], sizeof(nls_name));
+			break;
+		case Opt_prealloc:
+			opts->prealloc = 1;
+			break;
+		case Opt_no_acs_rules:
+			opts->no_acs_rules = 1;
+			break;
+		default:
+			if (!silent)
+				ntfs_err(
+					sb,
+					"Unrecognized mount option \"%s\" or missing value",
+					p);
+			//return -EINVAL;
+		}
+	}
+
+out:
+	if (!strcmp(nls_name[0] ? nls_name : CONFIG_NLS_DEFAULT, "utf8")) {
+		/* For UTF-8 use utf16s_to_utf8s/utf8s_to_utf16s instead of nls */
+		nls = NULL;
+	} else if (nls_name[0]) {
+		nls = load_nls(nls_name);
+		if (!nls) {
+			ntfs_err(sb, "failed to load \"%s\"", nls_name);
+			return -EINVAL;
+		}
+	} else {
+		nls = load_nls_default();
+		if (!nls) {
+			ntfs_err(sb, "failed to load default nls");
+			return -EINVAL;
+		}
+	}
+	opts->nls = nls;
+
+	return 0;
+}
+
+static int ntfs_remount(struct super_block *sb, int *flags, char *data)
+{
+	int err, ro_rw;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	struct ntfs_mount_options old_opts;
+	char *orig_data = kstrdup(data, GFP_KERNEL);
+
+	if (data && !orig_data)
+		return -ENOMEM;
+
+	/* Store  original options */
+	memcpy(&old_opts, &sbi->options, sizeof(old_opts));
+	clear_mount_options(&sbi->options);
+	memset(&sbi->options, 0, sizeof(sbi->options));
+
+	err = ntfs_parse_options(sb, data, 0, &sbi->options);
+	if (err)
+		goto restore_opts;
+
+	ro_rw = 0;
+	if (sb_rdonly(sb) && !(*flags & SB_RDONLY)) {
+		/* ro -> rw */
+		ro_rw = 1;
+		if (sbi->flags & NTFS_FLAGS_NEED_REPLAY) {
+			ntfs_warn(
+				sb,
+				"Couldn't remount rw because journal is not replayed. Please umount/remount instead\n");
+			err = -EINVAL;
+			goto restore_opts;
+		}
+	}
+
+	sync_filesystem(sb);
+
+	if (ro_rw && (sbi->volume.flags & VOLUME_FLAG_DIRTY) &&
+	    !sbi->options.force) {
+		ntfs_warn(sb, "volume is dirty and \"force\" flag is not set!");
+		err = -EINVAL;
+		goto restore_opts;
+	}
+
+	clear_mount_options(&old_opts);
+
+	*flags = (*flags & ~SB_LAZYTIME) | (sb->s_flags & SB_LAZYTIME) |
+		 SB_NODIRATIME | SB_NOATIME;
+	ntfs_info(sb, "re-mounted. Opts: %s", orig_data);
+	err = 0;
+	goto out;
+
+restore_opts:
+	clear_mount_options(&sbi->options);
+	memcpy(&sbi->options, &old_opts, sizeof(old_opts));
+
+out:
+	kfree(orig_data);
+	return err;
+}
+
+static struct kmem_cache *ntfs_inode_cachep;
+
+static struct inode *ntfs_alloc_inode(struct super_block *sb)
+{
+	struct ntfs_inode *ni = kmem_cache_alloc(ntfs_inode_cachep, GFP_NOFS);
+
+	if (!ni)
+		return NULL;
+
+	memset(ni, 0, offsetof(struct ntfs_inode, vfs_inode));
+
+	mutex_init(&ni->ni_lock);
+
+	return &ni->vfs_inode;
+}
+
+static void ntfs_i_callback(struct rcu_head *head)
+{
+	struct inode *inode = container_of(head, struct inode, i_rcu);
+	struct ntfs_inode *ni = ntfs_i(inode);
+
+	mutex_destroy(&ni->ni_lock);
+
+	kmem_cache_free(ntfs_inode_cachep, ni);
+}
+
+static void ntfs_destroy_inode(struct inode *inode)
+{
+	call_rcu(&inode->i_rcu, ntfs_i_callback);
+}
+
+static void init_once(void *foo)
+{
+	struct ntfs_inode *ni = foo;
+
+	inode_init_once(&ni->vfs_inode);
+}
+
+/* noinline to reduce binary size*/
+static noinline void put_ntfs(struct ntfs_sb_info *sbi)
+{
+	ntfs_free(sbi->new_rec);
+	ntfs_free(ntfs_put_shared(sbi->upcase));
+	ntfs_free(sbi->def_table);
+
+	wnd_close(&sbi->mft.bitmap);
+	wnd_close(&sbi->used.bitmap);
+
+	if (sbi->mft.ni)
+		iput(&sbi->mft.ni->vfs_inode);
+
+	if (sbi->security.ni)
+		iput(&sbi->security.ni->vfs_inode);
+
+	if (sbi->reparse.ni)
+		iput(&sbi->reparse.ni->vfs_inode);
+
+	if (sbi->objid.ni)
+		iput(&sbi->objid.ni->vfs_inode);
+
+	if (sbi->volume.ni)
+		iput(&sbi->volume.ni->vfs_inode);
+
+	ntfs_update_mftmirr(sbi, 0);
+
+	indx_clear(&sbi->security.index_sii);
+	indx_clear(&sbi->security.index_sdh);
+	indx_clear(&sbi->reparse.index_r);
+	indx_clear(&sbi->objid.index_o);
+	ntfs_free(sbi->compress.lznt);
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+	xpress_free_decompressor(sbi->compress.xpress);
+	lzx_free_decompressor(sbi->compress.lzx);
+#endif
+	clear_mount_options(&sbi->options);
+
+	ntfs_free(sbi);
+}
+
+static void ntfs_put_super(struct super_block *sb)
+{
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+
+	/*mark rw ntfs as clear, if possible*/
+	ntfs_set_state(sbi, NTFS_DIRTY_CLEAR);
+
+	put_ntfs(sbi);
+
+	sync_blockdev(sb->s_bdev);
+}
+
+static int ntfs_statfs(struct dentry *dentry, struct kstatfs *buf)
+{
+	struct super_block *sb = dentry->d_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	struct wnd_bitmap *wnd = &sbi->used.bitmap;
+
+	buf->f_type = sb->s_magic;
+	buf->f_bsize = sbi->cluster_size;
+	buf->f_blocks = wnd->nbits;
+
+	buf->f_bfree = buf->f_bavail = wnd_zeroes(wnd);
+	buf->f_fsid.val[0] = sbi->volume.ser_num;
+	buf->f_fsid.val[1] = (sbi->volume.ser_num >> 32);
+	buf->f_namelen = NTFS_NAME_LEN;
+
+	return 0;
+}
+
+static int ntfs_show_options(struct seq_file *m, struct dentry *root)
+{
+	struct super_block *sb = root->d_sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	struct ntfs_mount_options *opts = &sbi->options;
+
+	if (opts->uid)
+		seq_printf(m, ",uid=%u",
+			   from_kuid_munged(&init_user_ns, opts->fs_uid));
+	if (opts->gid)
+		seq_printf(m, ",gid=%u",
+			   from_kgid_munged(&init_user_ns, opts->fs_gid));
+	if (opts->fmask)
+		seq_printf(m, ",fmask=%04o", ~opts->fs_fmask_inv);
+	if (opts->dmask)
+		seq_printf(m, ",dmask=%04o", ~opts->fs_dmask_inv);
+	if (opts->nls)
+		seq_printf(m, ",nls=%s", opts->nls->charset);
+	else
+		seq_puts(m, ",nls=utf8");
+	if (opts->sys_immutable)
+		seq_puts(m, ",sys_immutable");
+	if (opts->discard)
+		seq_puts(m, ",discard");
+	if (opts->sparse)
+		seq_puts(m, ",sparse");
+	if (opts->showmeta)
+		seq_puts(m, ",showmeta");
+	if (opts->nohidden)
+		seq_puts(m, ",nohidden");
+	if (opts->force)
+		seq_puts(m, ",force");
+	if (opts->no_acs_rules)
+		seq_puts(m, ",no_acs_rules");
+	if (opts->prealloc)
+		seq_puts(m, ",prealloc");
+	if (sb->s_flags & SB_POSIXACL)
+		seq_puts(m, ",acl");
+	if (sb->s_flags & SB_NOATIME)
+		seq_puts(m, ",noatime");
+
+	return 0;
+}
+
+/*super_operations::sync_fs*/
+static int ntfs_sync_fs(struct super_block *sb, int wait)
+{
+	int err = 0, err2;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	struct ntfs_inode *ni;
+	struct inode *inode;
+
+	ni = sbi->security.ni;
+	if (ni) {
+		inode = &ni->vfs_inode;
+		err2 = _ni_write_inode(inode, wait);
+		if (err2 && !err)
+			err = err2;
+	}
+
+	ni = sbi->objid.ni;
+	if (ni) {
+		inode = &ni->vfs_inode;
+		err2 = _ni_write_inode(inode, wait);
+		if (err2 && !err)
+			err = err2;
+	}
+
+	ni = sbi->reparse.ni;
+	if (ni) {
+		inode = &ni->vfs_inode;
+		err2 = _ni_write_inode(inode, wait);
+		if (err2 && !err)
+			err = err2;
+	}
+
+	if (!err)
+		ntfs_set_state(sbi, NTFS_DIRTY_CLEAR);
+
+	ntfs_update_mftmirr(sbi, wait);
+
+	return err;
+}
+
+static const struct super_operations ntfs_sops = {
+	.alloc_inode = ntfs_alloc_inode,
+	.destroy_inode = ntfs_destroy_inode,
+	.evict_inode = ntfs_evict_inode,
+	.put_super = ntfs_put_super,
+	.statfs = ntfs_statfs,
+	.show_options = ntfs_show_options,
+	.sync_fs = ntfs_sync_fs,
+	.remount_fs = ntfs_remount,
+	.write_inode = ntfs3_write_inode,
+};
+
+static struct inode *ntfs_export_get_inode(struct super_block *sb, u64 ino,
+					   u32 generation)
+{
+	struct MFT_REF ref;
+	struct inode *inode;
+
+	ref.low = cpu_to_le32(ino);
+#ifdef NTFS3_64BIT_CLUSTER
+	ref.high = cpu_to_le16(ino >> 32);
+#else
+	ref.high = 0;
+#endif
+	ref.seq = cpu_to_le16(generation);
+
+	inode = ntfs_iget5(sb, &ref, NULL);
+	if (!IS_ERR(inode) && is_bad_inode(inode)) {
+		iput(inode);
+		inode = ERR_PTR(-ESTALE);
+	}
+
+	return inode;
+}
+
+static struct dentry *ntfs_fh_to_dentry(struct super_block *sb, struct fid *fid,
+					int fh_len, int fh_type)
+{
+	return generic_fh_to_dentry(sb, fid, fh_len, fh_type,
+				    ntfs_export_get_inode);
+}
+
+static struct dentry *ntfs_fh_to_parent(struct super_block *sb, struct fid *fid,
+					int fh_len, int fh_type)
+{
+	return generic_fh_to_parent(sb, fid, fh_len, fh_type,
+				    ntfs_export_get_inode);
+}
+
+/* TODO: == ntfs_sync_inode */
+static int ntfs_nfs_commit_metadata(struct inode *inode)
+{
+	return _ni_write_inode(inode, 1);
+}
+
+static const struct export_operations ntfs_export_ops = {
+	.fh_to_dentry = ntfs_fh_to_dentry,
+	.fh_to_parent = ntfs_fh_to_parent,
+	.get_parent = ntfs3_get_parent,
+	.commit_metadata = ntfs_nfs_commit_metadata,
+};
+
+/* Returns Gb,Mb to print with "%u.%02u Gb" */
+static u32 format_size_gb(const u64 bytes, u32 *mb)
+{
+	/* Do simple right 30 bit shift of 64 bit value */
+	u64 kbytes = bytes >> 10;
+	u32 kbytes32 = kbytes;
+
+	*mb = (100 * (kbytes32 & 0xfffff) + 0x7ffff) >> 20;
+	if (*mb >= 100)
+		*mb = 99;
+
+	return (kbytes32 >> 20) | (((u32)(kbytes >> 32)) << 12);
+}
+
+static u32 true_sectors_per_clst(const struct NTFS_BOOT *boot)
+{
+	return boot->sectors_per_clusters <= 0x80 ?
+		       boot->sectors_per_clusters :
+		       (1u << (0 - boot->sectors_per_clusters));
+}
+
+/* inits internal info from on-disk boot sector*/
+static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+			       u64 dev_size)
+{
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	int err;
+	u32 mb, gb, boot_sector_size, sct_per_clst, record_size;
+	u64 sectors, clusters, fs_size, mlcn, mlcn2;
+	struct NTFS_BOOT *boot;
+	struct buffer_head *bh;
+	struct MFT_REC *rec;
+	u16 fn, ao;
+
+	sbi->volume.blocks = dev_size >> PAGE_SHIFT;
+
+	bh = ntfs_bread(sb, 0);
+	if (!bh)
+		return -EIO;
+
+	err = -EINVAL;
+	boot = (struct NTFS_BOOT *)bh->b_data;
+
+	if (memcmp(boot->system_id, "NTFS    ", sizeof("NTFS    ") - 1))
+		goto out;
+
+	/* 0x55AA is not mandaroty. Thanks Maxim Suhanov*/
+	/*if (0x55 != boot->boot_magic[0] || 0xAA != boot->boot_magic[1])
+	 *	goto out;
+	 */
+
+	boot_sector_size = (u32)boot->bytes_per_sector[1] << 8;
+	if (boot->bytes_per_sector[0] || boot_sector_size < SECTOR_SIZE ||
+	    !is_power_of2(boot_sector_size)) {
+		goto out;
+	}
+
+	/* cluster size: 512, 1K, 2K, 4K, ... 2M */
+	sct_per_clst = true_sectors_per_clst(boot);
+	if (!is_power_of2(sct_per_clst))
+		goto out;
+
+	mlcn = le64_to_cpu(boot->mft_clst);
+	mlcn2 = le64_to_cpu(boot->mft2_clst);
+	sectors = le64_to_cpu(boot->sectors_per_volume);
+
+	if (mlcn * sct_per_clst >= sectors)
+		goto out;
+
+	if (mlcn2 * sct_per_clst >= sectors)
+		goto out;
+
+	/* Check MFT record size */
+	if ((boot->record_size < 0 &&
+	     SECTOR_SIZE > (2U << (-boot->record_size))) ||
+	    (boot->record_size >= 0 && !is_power_of2(boot->record_size))) {
+		goto out;
+	}
+
+	/* Check index record size */
+	if ((boot->index_size < 0 &&
+	     SECTOR_SIZE > (2U << (-boot->index_size))) ||
+	    (boot->index_size >= 0 && !is_power_of2(boot->index_size))) {
+		goto out;
+	}
+
+	sbi->sector_size = boot_sector_size;
+	sbi->sector_bits = blksize_bits(boot_sector_size);
+	fs_size = (sectors + 1) << sbi->sector_bits;
+
+	gb = format_size_gb(fs_size, &mb);
+
+	/*
+	 * - Volume formatted and mounted with the same sector size
+	 * - Volume formatted 4K and mounted as 512
+	 * - Volume formatted 512 and mounted as 4K
+	 */
+	if (sbi->sector_size != sector_size) {
+		ntfs_warn(sb,
+			  "Different NTFS' sector size and media sector size");
+		dev_size += sector_size - 1;
+	}
+
+	sbi->cluster_size = boot_sector_size * sct_per_clst;
+	sbi->cluster_bits = blksize_bits(sbi->cluster_size);
+
+	sbi->mft.lbo = mlcn << sbi->cluster_bits;
+	sbi->mft.lbo2 = mlcn2 << sbi->cluster_bits;
+
+	if (sbi->cluster_size < sbi->sector_size)
+		goto out;
+
+	sbi->cluster_mask = sbi->cluster_size - 1;
+	sbi->cluster_mask_inv = ~(u64)sbi->cluster_mask;
+	sbi->record_size = record_size = boot->record_size < 0 ?
+						 1 << (-boot->record_size) :
+						 (u32)boot->record_size
+							 << sbi->cluster_bits;
+
+	if (record_size > MAXIMUM_BYTES_PER_MFT)
+		goto out;
+
+	sbi->record_bits = blksize_bits(record_size);
+	sbi->attr_size_tr = (5 * record_size >> 4); // ~320 bytes
+
+	sbi->max_bytes_per_attr =
+		record_size - QuadAlign(MFTRECORD_FIXUP_OFFSET_1) -
+		QuadAlign(((record_size >> SECTOR_SHIFT) * sizeof(short))) -
+		QuadAlign(sizeof(enum ATTR_TYPE));
+
+	sbi->index_size = boot->index_size < 0 ?
+				  1u << (-boot->index_size) :
+				  (u32)boot->index_size << sbi->cluster_bits;
+
+	sbi->volume.ser_num = le64_to_cpu(boot->serial_num);
+	sbi->volume.size = sectors << sbi->sector_bits;
+
+	/* warning if RAW volume */
+	if (dev_size < fs_size) {
+		u32 mb0, gb0;
+
+		gb0 = format_size_gb(dev_size, &mb0);
+		ntfs_warn(
+			sb,
+			"RAW NTFS volume: Filesystem size %u.%02u Gb > volume size %u.%02u Gb. Mount in read-only",
+			gb, mb, gb0, mb0);
+		sb->s_flags |= SB_RDONLY;
+	}
+
+	clusters = sbi->volume.size >> sbi->cluster_bits;
+#ifdef NTFS3_64BIT_CLUSTER
+#if BITS_PER_LONG < 64
+#error "NTFS3_64BIT_CLUSTER incompatible in 32 bit OS"
+#endif
+#else
+	/* 32 bits per cluster */
+	if (clusters >> 32) {
+		ntfs_notice(
+			sb,
+			"NTFS %u.%02u Gb is too big to use 32 bits per cluster",
+			gb, mb);
+		goto out;
+	}
+#endif
+
+	sbi->used.bitmap.nbits = clusters;
+
+	rec = ntfs_zalloc(record_size);
+	if (!rec) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	sbi->new_rec = rec;
+	rec->rhdr.sign = NTFS_FILE_SIGNATURE;
+	rec->rhdr.fix_off = cpu_to_le16(MFTRECORD_FIXUP_OFFSET_1);
+	fn = (sbi->record_size >> SECTOR_SHIFT) + 1;
+	rec->rhdr.fix_num = cpu_to_le16(fn);
+	ao = QuadAlign(MFTRECORD_FIXUP_OFFSET_1 + sizeof(short) * fn);
+	rec->attr_off = cpu_to_le16(ao);
+	rec->used = cpu_to_le32(ao + QuadAlign(sizeof(enum ATTR_TYPE)));
+	rec->total = cpu_to_le32(sbi->record_size);
+	((struct ATTRIB *)Add2Ptr(rec, ao))->type = ATTR_END;
+
+	if (sbi->cluster_size < PAGE_SIZE)
+		sb_set_blocksize(sb, sbi->cluster_size);
+
+	sbi->block_mask = sb->s_blocksize - 1;
+	sbi->blocks_per_cluster = sbi->cluster_size >> sb->s_blocksize_bits;
+	sbi->volume.blocks = sbi->volume.size >> sb->s_blocksize_bits;
+
+	/* Maximum size for normal files */
+	sbi->maxbytes = (clusters << sbi->cluster_bits) - 1;
+
+#ifdef NTFS3_64BIT_CLUSTER
+	if (clusters >= (1ull << (64 - sbi->cluster_bits)))
+		sbi->maxbytes = -1;
+	sbi->maxbytes_sparse = -1;
+#else
+	/* Maximum size for sparse file */
+	sbi->maxbytes_sparse = (1ull << (sbi->cluster_bits + 32)) - 1;
+#endif
+
+	err = 0;
+
+out:
+	brelse(bh);
+
+	return err;
+}
+
+/* try to mount*/
+static int ntfs_fill_super(struct super_block *sb, void *data, int silent)
+{
+	int err;
+	struct ntfs_sb_info *sbi;
+	struct block_device *bdev = sb->s_bdev;
+	struct inode *bd_inode = bdev->bd_inode;
+	struct request_queue *rq = bdev_get_queue(bdev);
+	struct inode *inode = NULL;
+	struct ntfs_inode *ni;
+	size_t i, tt;
+	CLST vcn, lcn, len;
+	struct ATTRIB *attr;
+	const struct VOLUME_INFO *info;
+	u32 idx, done, bytes;
+	struct ATTR_DEF_ENTRY *t;
+	u16 *upcase = NULL;
+	u16 *shared;
+	bool is_ro;
+	struct MFT_REF ref;
+
+	ref.high = 0;
+
+	sbi = ntfs_zalloc(sizeof(struct ntfs_sb_info));
+	if (!sbi)
+		return -ENOMEM;
+
+	sb->s_fs_info = sbi;
+	sbi->sb = sb;
+	sb->s_flags |= SB_NODIRATIME;
+	sb->s_magic = 0x7366746e; // "ntfs"
+	sb->s_op = &ntfs_sops;
+	sb->s_export_op = &ntfs_export_ops;
+	sb->s_time_gran = NTFS_TIME_GRAN; // 100 nsec
+	sb->s_xattr = ntfs_xattr_handlers;
+	sb->s_maxbytes = MAX_LFS_FILESIZE;
+
+	ratelimit_state_init(&sbi->msg_ratelimit, DEFAULT_RATELIMIT_INTERVAL,
+			     DEFAULT_RATELIMIT_BURST);
+
+	err = ntfs_parse_options(sb, data, silent, &sbi->options);
+	if (err)
+		goto out;
+
+	if (!rq || !blk_queue_discard(rq) || !rq->limits.discard_granularity) {
+		;
+	} else {
+		sbi->discard_granularity = rq->limits.discard_granularity;
+		sbi->discard_granularity_mask_inv =
+			~(u64)(sbi->discard_granularity - 1);
+	}
+
+	sb_set_blocksize(sb, PAGE_SIZE);
+
+	/* parse boot */
+	err = ntfs_init_from_boot(sb, rq ? queue_logical_block_size(rq) : 512,
+				  bd_inode->i_size);
+	if (err)
+		goto out;
+
+	mutex_init(&sbi->compress.mtx_lznt);
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+	mutex_init(&sbi->compress.mtx_xpress);
+	mutex_init(&sbi->compress.mtx_lzx);
+#endif
+
+	/*
+	 * Load $Volume. This should be done before $LogFile
+	 * 'cause 'sbi->volume.ni' is used 'ntfs_set_state'
+	 */
+	ref.low = cpu_to_le32(MFT_REC_VOL);
+	ref.seq = cpu_to_le16(MFT_REC_VOL);
+	inode = ntfs_iget5(sb, &ref, &NAME_VOLUME);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $Volume.");
+		inode = NULL;
+		goto out;
+	}
+
+	ni = ntfs_i(inode);
+
+	/* Load and save label (not necessary) */
+	attr = ni_find_attr(ni, NULL, NULL, ATTR_LABEL, NULL, 0, NULL, NULL);
+
+	if (!attr) {
+		/* It is ok if no ATTR_LABEL */
+	} else if (!attr->non_res && !is_attr_ext(attr)) {
+		/* $AttrDef allows labels to be up to 128 symbols */
+		err = utf16s_to_utf8s(resident_data(attr),
+				      le32_to_cpu(attr->res.data_size) >> 1,
+				      UTF16_LITTLE_ENDIAN, sbi->volume.label,
+				      sizeof(sbi->volume.label));
+		if (err < 0)
+			sbi->volume.label[0] = 0;
+	} else {
+		/* should we break mounting here? */
+		//err = -EINVAL;
+		//goto out;
+	}
+
+	attr = ni_find_attr(ni, attr, NULL, ATTR_VOL_INFO, NULL, 0, NULL, NULL);
+	if (!attr || is_attr_ext(attr)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	info = resident_data_ex(attr, SIZEOF_ATTRIBUTE_VOLUME_INFO);
+	if (!info) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	sbi->volume.major_ver = info->major_ver;
+	sbi->volume.minor_ver = info->minor_ver;
+	sbi->volume.flags = info->flags;
+
+	sbi->volume.ni = ni;
+	inode = NULL;
+
+	/* Load $MFTMirr to estimate recs_mirr */
+	ref.low = cpu_to_le32(MFT_REC_MIRR);
+	ref.seq = cpu_to_le16(MFT_REC_MIRR);
+	inode = ntfs_iget5(sb, &ref, &NAME_MIRROR);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $MFTMirr.");
+		inode = NULL;
+		goto out;
+	}
+
+	sbi->mft.recs_mirr =
+		ntfs_up_cluster(sbi, inode->i_size) >> sbi->record_bits;
+
+	iput(inode);
+
+	/* Load $LogFile to replay */
+	ref.low = cpu_to_le32(MFT_REC_LOG);
+	ref.seq = cpu_to_le16(MFT_REC_LOG);
+	inode = ntfs_iget5(sb, &ref, &NAME_LOGFILE);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $LogFile.");
+		inode = NULL;
+		goto out;
+	}
+
+	ni = ntfs_i(inode);
+
+	err = ntfs_loadlog_and_replay(ni, sbi);
+	if (err)
+		goto out;
+
+	iput(inode);
+	inode = NULL;
+
+	is_ro = sb_rdonly(sbi->sb);
+
+	if (sbi->flags & NTFS_FLAGS_NEED_REPLAY) {
+		if (!is_ro) {
+			ntfs_warn(sb,
+				  "failed to replay log file. Can't mount rw!");
+			err = -EINVAL;
+			goto out;
+		}
+	} else if (sbi->volume.flags & VOLUME_FLAG_DIRTY) {
+		if (!is_ro && !sbi->options.force) {
+			ntfs_warn(
+				sb,
+				"volume is dirty and \"force\" flag is not set!");
+			err = -EINVAL;
+			goto out;
+		}
+	}
+
+	/* Load $MFT */
+	ref.low = cpu_to_le32(MFT_REC_MFT);
+	ref.seq = cpu_to_le16(1);
+
+	inode = ntfs_iget5(sb, &ref, &NAME_MFT);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $MFT.");
+		inode = NULL;
+		goto out;
+	}
+
+	ni = ntfs_i(inode);
+
+	sbi->mft.used = ni->i_valid >> sbi->record_bits;
+	tt = inode->i_size >> sbi->record_bits;
+	sbi->mft.next_free = MFT_REC_USER;
+
+	err = wnd_init(&sbi->mft.bitmap, sb, tt);
+	if (err)
+		goto out;
+
+	err = ni_load_all_mi(ni);
+	if (err)
+		goto out;
+
+	sbi->mft.ni = ni;
+
+	/* Load $BadClus */
+	ref.low = cpu_to_le32(MFT_REC_BADCLUST);
+	ref.seq = cpu_to_le16(MFT_REC_BADCLUST);
+	inode = ntfs_iget5(sb, &ref, &NAME_BADCLUS);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $BadClus.");
+		inode = NULL;
+		goto out;
+	}
+
+	ni = ntfs_i(inode);
+
+	for (i = 0; run_get_entry(&ni->file.run, i, &vcn, &lcn, &len); i++) {
+		if (lcn == SPARSE_LCN)
+			continue;
+
+		if (!sbi->bad_clusters)
+			ntfs_notice(sb, "Volume contains bad blocks");
+
+		sbi->bad_clusters += len;
+	}
+
+	iput(inode);
+
+	/* Load $Bitmap */
+	ref.low = cpu_to_le32(MFT_REC_BITMAP);
+	ref.seq = cpu_to_le16(MFT_REC_BITMAP);
+	inode = ntfs_iget5(sb, &ref, &NAME_BITMAP);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $Bitmap.");
+		inode = NULL;
+		goto out;
+	}
+
+	ni = ntfs_i(inode);
+
+#ifndef NTFS3_64BIT_CLUSTER
+	if (inode->i_size >> 32) {
+		err = -EINVAL;
+		goto out;
+	}
+#endif
+
+	/* Check bitmap boundary */
+	tt = sbi->used.bitmap.nbits;
+	if (inode->i_size < bitmap_size(tt)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Not necessary */
+	sbi->used.bitmap.set_tail = true;
+	err = wnd_init(&sbi->used.bitmap, sbi->sb, tt);
+	if (err)
+		goto out;
+
+	iput(inode);
+
+	/* Compute the mft zone */
+	err = ntfs_refresh_zone(sbi);
+	if (err)
+		goto out;
+
+	/* Load $AttrDef */
+	ref.low = cpu_to_le32(MFT_REC_ATTR);
+	ref.seq = cpu_to_le16(MFT_REC_ATTR);
+	inode = ntfs_iget5(sbi->sb, &ref, &NAME_ATTRDEF);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $AttrDef -> %d", err);
+		inode = NULL;
+		goto out;
+	}
+
+	if (inode->i_size < sizeof(struct ATTR_DEF_ENTRY)) {
+		err = -EINVAL;
+		goto out;
+	}
+	bytes = inode->i_size;
+	sbi->def_table = t = ntfs_malloc(bytes);
+	if (!t) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	for (done = idx = 0; done < bytes; done += PAGE_SIZE, idx++) {
+		unsigned long tail = bytes - done;
+		struct page *page = ntfs_map_page(inode->i_mapping, idx);
+
+		if (IS_ERR(page)) {
+			err = PTR_ERR(page);
+			goto out;
+		}
+		memcpy(Add2Ptr(t, done), page_address(page),
+		       min(PAGE_SIZE, tail));
+		ntfs_unmap_page(page);
+
+		if (!idx && ATTR_STD != t->type) {
+			err = -EINVAL;
+			goto out;
+		}
+	}
+
+	t += 1;
+	sbi->def_entries = 1;
+	done = sizeof(struct ATTR_DEF_ENTRY);
+	sbi->reparse.max_size = MAXIMUM_REPARSE_DATA_BUFFER_SIZE;
+
+	while (done + sizeof(struct ATTR_DEF_ENTRY) <= bytes) {
+		u32 t32 = le32_to_cpu(t->type);
+
+		if ((t32 & 0xF) || le32_to_cpu(t[-1].type) >= t32)
+			break;
+
+		if (t->type == ATTR_REPARSE)
+			sbi->reparse.max_size = le64_to_cpu(t->max_sz);
+
+		done += sizeof(struct ATTR_DEF_ENTRY);
+		t += 1;
+		sbi->def_entries += 1;
+	}
+	iput(inode);
+
+	/* Load $UpCase */
+	ref.low = cpu_to_le32(MFT_REC_UPCASE);
+	ref.seq = cpu_to_le16(MFT_REC_UPCASE);
+	inode = ntfs_iget5(sb, &ref, &NAME_UPCASE);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load $LogFile.");
+		inode = NULL;
+		goto out;
+	}
+
+	ni = ntfs_i(inode);
+
+	if (inode->i_size != 0x10000 * sizeof(short)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	sbi->upcase = upcase = ntfs_malloc(0x10000 * sizeof(short));
+	if (!upcase) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	for (idx = 0; idx < (0x10000 * sizeof(short) >> PAGE_SHIFT); idx++) {
+		const __le16 *src;
+		u16 *dst = Add2Ptr(upcase, idx << PAGE_SHIFT);
+		struct page *page = ntfs_map_page(inode->i_mapping, idx);
+
+		if (IS_ERR(page)) {
+			err = PTR_ERR(page);
+			goto out;
+		}
+
+		src = page_address(page);
+
+#ifdef __BIG_ENDIAN
+		for (i = 0; i < PAGE_SIZE / sizeof(u16); i++)
+			*dst++ = le16_to_cpu(*src++);
+#else
+		memcpy(dst, src, PAGE_SIZE);
+#endif
+		ntfs_unmap_page(page);
+	}
+
+	shared = ntfs_set_shared(upcase, 0x10000 * sizeof(short));
+	if (shared && upcase != shared) {
+		sbi->upcase = shared;
+		ntfs_free(upcase);
+	}
+
+	iput(inode);
+	inode = NULL;
+
+	if (is_ntfs3(sbi)) {
+		/* Load $Secure */
+		err = ntfs_security_init(sbi);
+		if (err)
+			goto out;
+
+		/* Load $Extend */
+		err = ntfs_extend_init(sbi);
+		if (err)
+			goto load_root;
+
+		/* Load $Extend\$Reparse */
+		err = ntfs_reparse_init(sbi);
+		if (err)
+			goto load_root;
+
+		/* Load $Extend\$ObjId */
+		err = ntfs_objid_init(sbi);
+		if (err)
+			goto load_root;
+	}
+
+load_root:
+
+	/* Load root */
+	ref.low = cpu_to_le32(MFT_REC_ROOT);
+	ref.seq = cpu_to_le16(MFT_REC_ROOT);
+	inode = ntfs_iget5(sb, &ref, &NAME_ROOT);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		ntfs_err(sb, "Failed to load root.");
+		inode = NULL;
+		goto out;
+	}
+
+	ni = ntfs_i(inode);
+
+	sb->s_root = d_make_root(inode);
+
+	if (!sb->s_root) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	return 0;
+
+out:
+	iput(inode);
+
+	if (sb->s_root) {
+		d_drop(sb->s_root);
+		sb->s_root = NULL;
+	}
+
+	put_ntfs(sbi);
+
+	sb->s_fs_info = NULL;
+	return err;
+}
+
+void ntfs_unmap_meta(struct super_block *sb, CLST lcn, CLST len)
+{
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	struct block_device *bdev = sb->s_bdev;
+	sector_t devblock = (u64)lcn * sbi->blocks_per_cluster;
+	unsigned long blocks = (u64)len * sbi->blocks_per_cluster;
+	unsigned long cnt = 0;
+	unsigned long limit = global_zone_page_state(NR_FREE_PAGES)
+			      << (PAGE_SHIFT - sb->s_blocksize_bits);
+
+	if (limit >= 0x2000)
+		limit -= 0x1000;
+	else if (limit < 32)
+		limit = 32;
+	else
+		limit >>= 1;
+
+	while (blocks--) {
+		clean_bdev_aliases(bdev, devblock++, 1);
+		if (cnt++ >= limit) {
+			sync_blockdev(bdev);
+			cnt = 0;
+		}
+	}
+}
+
+/*
+ * ntfs_discard
+ *
+ * issue a discard request (trim for SSD)
+ */
+int ntfs_discard(struct ntfs_sb_info *sbi, CLST lcn, CLST len)
+{
+	int err;
+	u64 lbo, bytes, start, end;
+	struct super_block *sb;
+
+	if (sbi->used.next_free_lcn == lcn + len)
+		sbi->used.next_free_lcn = lcn;
+
+	if (sbi->flags & NTFS_FLAGS_NODISCARD)
+		return -EOPNOTSUPP;
+
+	if (!sbi->options.discard)
+		return -EOPNOTSUPP;
+
+	lbo = (u64)lcn << sbi->cluster_bits;
+	bytes = (u64)len << sbi->cluster_bits;
+
+	/* Align up 'start' on discard_granularity */
+	start = (lbo + sbi->discard_granularity - 1) &
+		sbi->discard_granularity_mask_inv;
+	/* Align down 'end' on discard_granularity */
+	end = (lbo + bytes) & sbi->discard_granularity_mask_inv;
+
+	sb = sbi->sb;
+	if (start >= end)
+		return 0;
+
+	err = blkdev_issue_discard(sb->s_bdev, start >> 9, (end - start) >> 9,
+				   GFP_NOFS, 0);
+
+	if (err == -EOPNOTSUPP)
+		sbi->flags |= NTFS_FLAGS_NODISCARD;
+
+	return err;
+}
+
+static struct dentry *ntfs_mount(struct file_system_type *fs_type, int flags,
+				 const char *dev_name, void *data)
+{
+	return mount_bdev(fs_type, flags, dev_name, data, ntfs_fill_super);
+}
+
+static struct file_system_type ntfs_fs_type = {
+	.owner = THIS_MODULE,
+	.name = "ntfs3",
+	.mount = ntfs_mount,
+	.kill_sb = kill_block_super,
+	.fs_flags = FS_REQUIRES_DEV,
+};
+
+static int __init init_ntfs_fs(void)
+{
+	int err;
+
+#ifdef NTFS3_INDEX_BINARY_SEARCH
+	pr_notice("ntfs3: +index binary search\n");
+#endif
+
+#ifdef NTFS3_CHECK_FREE_CLST
+	pr_notice("ntfs3: +check free clusters\n");
+#endif
+
+#if NTFS_LINK_MAX < 0xffff
+	pr_notice("ntfs3: max link count %u\n", NTFS_LINK_MAX);
+#endif
+
+#ifdef NTFS3_64BIT_CLUSTER
+	pr_notice("ntfs3: 64 bits per cluster\n");
+#else
+	pr_notice("ntfs3: 32 bits per cluster\n");
+#endif
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+	pr_notice("ntfs3: read-only lzx/xpress compression included\n");
+#endif
+
+	ntfs_inode_cachep = kmem_cache_create(
+		"ntfs_inode_cache", sizeof(struct ntfs_inode), 0,
+		(SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT),
+		init_once);
+	if (!ntfs_inode_cachep) {
+		err = -ENOMEM;
+		goto failed;
+	}
+
+	err = register_filesystem(&ntfs_fs_type);
+	if (!err)
+		return 0;
+
+	kmem_cache_destroy(ntfs_inode_cachep);
+
+failed:
+	return err;
+}
+
+static void __exit exit_ntfs_fs(void)
+{
+	if (ntfs_inode_cachep) {
+		rcu_barrier();
+		kmem_cache_destroy(ntfs_inode_cachep);
+	}
+
+	unregister_filesystem(&ntfs_fs_type);
+}
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("ntfs3 filesystem");
+MODULE_AUTHOR("Konstantin Komarov");
+MODULE_ALIAS_FS("ntfs3");
+
+module_init(init_ntfs_fs);
+module_exit(exit_ntfs_fs);

From patchwork Thu Jan 28 09:04:48 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053207
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 6F886C433E0
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:18 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 1FB9B64DD1
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:18 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S231615AbhA1JIC (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:08:02 -0500
Received: from relaydlg-01.paragon-software.com ([81.5.88.159]:37252 "EHLO
        relaydlg-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231745AbhA1JGN (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:06:13 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relaydlg-01.paragon-software.com (Postfix) with ESMTPS id
 0A88582269;
        Thu, 28 Jan 2021 12:05:03 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824703;
        bh=Ld/myPzfyWTIKU6F/349QXz9FIwOD/3wFvrKLw9xnno=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=o4v0tUvqoc/tcyfJ5YXJbpkZn95AsXykHn3lksx9KzclsKLpqKWhJFoG5ifszgtLy
         jReLNftGmW5ohUJqzdqPifTKaAa1o5qDr9P9cQb0mfqkLqcMvRG5ObqO1ZbFIiJkFk
         rYEj5fgCh45EyN37OoJvhu3OZsTuRMMKH8EI7v+o=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:02 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 03/10] fs/ntfs3: Add bitmap
Date: Thu, 28 Jan 2021 12:04:48 +0300
Message-ID: 
 <20210128090455.3576502-4-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds bitmap

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 fs/ntfs3/bitfunc.c |  135 ++++
 fs/ntfs3/bitmap.c  | 1506 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 1641 insertions(+)
 create mode 100644 fs/ntfs3/bitfunc.c
 create mode 100644 fs/ntfs3/bitmap.c

diff --git a/fs/ntfs3/bitfunc.c b/fs/ntfs3/bitfunc.c
new file mode 100644
index 000000000000..2d43d718d645
--- /dev/null
+++ b/fs/ntfs3/bitfunc.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+#define BITS_IN_SIZE_T (sizeof(size_t) * 8)
+
+/*
+ * fill_mask[i] - first i bits are '1' , i = 0,1,2,3,4,5,6,7,8
+ * fill_mask[i] = 0xFF >> (8-i)
+ */
+static const u8 fill_mask[] = { 0x00, 0x01, 0x03, 0x07, 0x0F,
+				0x1F, 0x3F, 0x7F, 0xFF };
+
+/*
+ * zero_mask[i] - first i bits are '0' , i = 0,1,2,3,4,5,6,7,8
+ * zero_mask[i] = 0xFF << i
+ */
+static const u8 zero_mask[] = { 0xFF, 0xFE, 0xFC, 0xF8, 0xF0,
+				0xE0, 0xC0, 0x80, 0x00 };
+
+/*
+ * are_bits_clear
+ *
+ * Returns true if all bits [bit, bit+nbits) are zeros "0"
+ */
+bool are_bits_clear(const ulong *lmap, size_t bit, size_t nbits)
+{
+	size_t pos = bit & 7;
+	const u8 *map = (u8 *)lmap + (bit >> 3);
+
+	if (pos) {
+		if (8 - pos >= nbits)
+			return !nbits || !(*map & fill_mask[pos + nbits] &
+					   zero_mask[pos]);
+
+		if (*map++ & zero_mask[pos])
+			return false;
+		nbits -= 8 - pos;
+	}
+
+	pos = ((size_t)map) & (sizeof(size_t) - 1);
+	if (pos) {
+		pos = sizeof(size_t) - pos;
+		if (nbits >= pos * 8) {
+			for (nbits -= pos * 8; pos; pos--, map++) {
+				if (*map)
+					return false;
+			}
+		}
+	}
+
+	for (pos = nbits / BITS_IN_SIZE_T; pos; pos--, map += sizeof(size_t)) {
+		if (*((size_t *)map))
+			return false;
+	}
+
+	for (pos = (nbits % BITS_IN_SIZE_T) >> 3; pos; pos--, map++) {
+		if (*map)
+			return false;
+	}
+
+	pos = nbits & 7;
+	if (pos && (*map & fill_mask[pos]))
+		return false;
+
+	// All bits are zero
+	return true;
+}
+
+/*
+ * are_bits_set
+ *
+ * Returns true if all bits [bit, bit+nbits) are ones "1"
+ */
+bool are_bits_set(const ulong *lmap, size_t bit, size_t nbits)
+{
+	u8 mask;
+	size_t pos = bit & 7;
+	const u8 *map = (u8 *)lmap + (bit >> 3);
+
+	if (pos) {
+		if (8 - pos >= nbits) {
+			mask = fill_mask[pos + nbits] & zero_mask[pos];
+			return !nbits || (*map & mask) == mask;
+		}
+
+		mask = zero_mask[pos];
+		if ((*map++ & mask) != mask)
+			return false;
+		nbits -= 8 - pos;
+	}
+
+	pos = ((size_t)map) & (sizeof(size_t) - 1);
+	if (pos) {
+		pos = sizeof(size_t) - pos;
+		if (nbits >= pos * 8) {
+			for (nbits -= pos * 8; pos; pos--, map++) {
+				if (*map != 0xFF)
+					return false;
+			}
+		}
+	}
+
+	for (pos = nbits / BITS_IN_SIZE_T; pos; pos--, map += sizeof(size_t)) {
+		if (*((size_t *)map) != MINUS_ONE_T)
+			return false;
+	}
+
+	for (pos = (nbits % BITS_IN_SIZE_T) >> 3; pos; pos--, map++) {
+		if (*map != 0xFF)
+			return false;
+	}
+
+	pos = nbits & 7;
+	if (pos) {
+		u8 mask = fill_mask[pos];
+
+		if ((*map & mask) != mask)
+			return false;
+	}
+
+	// All bits are ones
+	return true;
+}
diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
new file mode 100644
index 000000000000..1a4d65a6aac1
--- /dev/null
+++ b/fs/ntfs3/bitmap.c
@@ -0,0 +1,1506 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+struct rb_node_key {
+	struct rb_node node;
+	size_t key;
+};
+
+/*
+ * Tree is sorted by start (key)
+ */
+struct e_node {
+	struct rb_node_key start; /* Tree sorted by start */
+	struct rb_node_key count; /* Tree sorted by len*/
+};
+
+static int wnd_rescan(struct wnd_bitmap *wnd);
+static struct buffer_head *wnd_map(struct wnd_bitmap *wnd, size_t iw);
+static bool wnd_is_free_hlp(struct wnd_bitmap *wnd, size_t bit, size_t bits);
+
+static inline u32 wnd_bits(const struct wnd_bitmap *wnd, size_t i)
+{
+	return i + 1 == wnd->nwnd ? wnd->bits_last : wnd->sb->s_blocksize * 8;
+}
+
+/*
+ * b_pos + b_len - biggest fragment
+ * Scan range [wpos wbits) window 'buf'
+ * Returns -1 if not found
+ */
+static size_t wnd_scan(const ulong *buf, size_t wbit, u32 wpos, u32 wend,
+		       size_t to_alloc, size_t *prev_tail, size_t *b_pos,
+		       size_t *b_len)
+{
+	while (wpos < wend) {
+		size_t free_len;
+		u32 free_bits, end;
+		u32 used = find_next_zero_bit(buf, wend, wpos);
+
+		if (used >= wend) {
+			if (*b_len < *prev_tail) {
+				*b_pos = wbit - *prev_tail;
+				*b_len = *prev_tail;
+			}
+
+			*prev_tail = 0;
+			return -1;
+		}
+
+		if (used > wpos) {
+			wpos = used;
+			if (*b_len < *prev_tail) {
+				*b_pos = wbit - *prev_tail;
+				*b_len = *prev_tail;
+			}
+
+			*prev_tail = 0;
+		}
+
+		/*
+		 * Now we have a fragment [wpos, wend) staring with 0
+		 */
+		end = wpos + to_alloc - *prev_tail;
+		free_bits = find_next_bit(buf, min(end, wend), wpos);
+
+		free_len = *prev_tail + free_bits - wpos;
+
+		if (*b_len < free_len) {
+			*b_pos = wbit + wpos - *prev_tail;
+			*b_len = free_len;
+		}
+
+		if (free_len >= to_alloc)
+			return wbit + wpos - *prev_tail;
+
+		if (free_bits >= wend) {
+			*prev_tail += free_bits - wpos;
+			return -1;
+		}
+
+		wpos = free_bits + 1;
+
+		*prev_tail = 0;
+	}
+
+	return -1;
+}
+
+/*
+ * wnd_close
+ *
+ *
+ */
+void wnd_close(struct wnd_bitmap *wnd)
+{
+	struct rb_node *node, *next;
+
+	if (wnd->free_bits != wnd->free_holder)
+		ntfs_free(wnd->free_bits);
+	run_close(&wnd->run);
+
+	node = rb_first(&wnd->start_tree);
+
+	while (node) {
+		next = rb_next(node);
+		rb_erase(node, &wnd->start_tree);
+		ntfs_free(rb_entry(node, struct e_node, start.node));
+		node = next;
+	}
+}
+
+static struct rb_node *rb_lookup(struct rb_root *root, size_t v)
+{
+	struct rb_node **p = &root->rb_node;
+	struct rb_node *r = NULL;
+
+	while (*p) {
+		struct rb_node_key *k;
+
+		k = rb_entry(*p, struct rb_node_key, node);
+		if (v < k->key) {
+			p = &(*p)->rb_left;
+		} else if (v > k->key) {
+			r = &k->node;
+			p = &(*p)->rb_right;
+		} else {
+			return &k->node;
+		}
+	}
+
+	return r;
+}
+
+/*
+ * rb_insert_count
+ *
+ * Helper function to insert special kind of 'count' tree
+ */
+static inline bool rb_insert_count(struct rb_root *root, struct e_node *e)
+{
+	struct rb_node **p = &root->rb_node;
+	struct rb_node *parent = NULL;
+	size_t e_ckey = e->count.key;
+	size_t e_skey = e->start.key;
+
+	while (*p) {
+		struct e_node *k =
+			rb_entry(parent = *p, struct e_node, count.node);
+
+		if (e_ckey > k->count.key) {
+			p = &(*p)->rb_left;
+		} else if (e_ckey < k->count.key) {
+			p = &(*p)->rb_right;
+		} else if (e_skey < k->start.key) {
+			p = &(*p)->rb_left;
+		} else if (e_skey > k->start.key) {
+			p = &(*p)->rb_right;
+		} else {
+			WARN_ON(1);
+			return false;
+		}
+	}
+
+	rb_link_node(&e->count.node, parent, p);
+	rb_insert_color(&e->count.node, root);
+	return true;
+}
+
+/*
+ * inline bool rb_insert_start
+ *
+ * Helper function to insert special kind of 'count' tree
+ */
+static inline bool rb_insert_start(struct rb_root *root, struct e_node *e)
+{
+	struct rb_node **p = &root->rb_node;
+	struct rb_node *parent = NULL;
+	size_t e_skey = e->start.key;
+
+	while (*p) {
+		struct e_node *k;
+
+		parent = *p;
+
+		k = rb_entry(parent, struct e_node, start.node);
+		if (e_skey < k->start.key) {
+			p = &(*p)->rb_left;
+		} else if (e_skey > k->start.key) {
+			p = &(*p)->rb_right;
+		} else {
+			WARN_ON(1);
+			return false;
+		}
+	}
+
+	rb_link_node(&e->start.node, parent, p);
+	rb_insert_color(&e->start.node, root);
+	return true;
+}
+
+#define NTFS_MAX_WND_EXTENTS (32u * 1024u)
+
+/*
+ * wnd_add_free_ext
+ *
+ * adds a new extent of free space
+ * build = 1 when building tree
+ */
+static void wnd_add_free_ext(struct wnd_bitmap *wnd, size_t bit, size_t len,
+			     bool build)
+{
+	struct e_node *e, *e0 = NULL;
+	size_t ib, end_in = bit + len;
+	struct rb_node *n;
+
+	if (build) {
+		/* Use extent_min to filter too short extents */
+		if (wnd->count >= NTFS_MAX_WND_EXTENTS &&
+		    len <= wnd->extent_min) {
+			wnd->uptodated = -1;
+			return;
+		}
+	} else {
+		/* Try to find extent before 'bit' */
+		n = rb_lookup(&wnd->start_tree, bit);
+
+		if (!n) {
+			n = rb_first(&wnd->start_tree);
+		} else {
+			e = rb_entry(n, struct e_node, start.node);
+			n = rb_next(n);
+			if (e->start.key + e->count.key == bit) {
+				/* Remove left */
+				bit = e->start.key;
+				len += e->count.key;
+				rb_erase(&e->start.node, &wnd->start_tree);
+				rb_erase(&e->count.node, &wnd->count_tree);
+				wnd->count -= 1;
+				e0 = e;
+			}
+		}
+
+		while (n) {
+			size_t next_end;
+
+			e = rb_entry(n, struct e_node, start.node);
+			next_end = e->start.key + e->count.key;
+			if (e->start.key > end_in)
+				break;
+
+			/* Remove right */
+			n = rb_next(n);
+			len += next_end - end_in;
+			end_in = next_end;
+			rb_erase(&e->start.node, &wnd->start_tree);
+			rb_erase(&e->count.node, &wnd->count_tree);
+			wnd->count -= 1;
+
+			if (!e0)
+				e0 = e;
+			else
+				ntfs_free(e);
+		}
+
+		if (wnd->uptodated != 1) {
+			/* Check bits before 'bit' */
+			ib = wnd->zone_bit == wnd->zone_end ||
+					     bit < wnd->zone_end ?
+				     0 :
+				     wnd->zone_end;
+
+			while (bit > ib && wnd_is_free_hlp(wnd, bit - 1, 1)) {
+				bit -= 1;
+				len += 1;
+			}
+
+			/* Check bits after 'end_in' */
+			ib = wnd->zone_bit == wnd->zone_end ||
+					     end_in > wnd->zone_bit ?
+				     wnd->nbits :
+				     wnd->zone_bit;
+
+			while (end_in < ib && wnd_is_free_hlp(wnd, end_in, 1)) {
+				end_in += 1;
+				len += 1;
+			}
+		}
+	}
+	/* Insert new fragment */
+	if (wnd->count >= NTFS_MAX_WND_EXTENTS) {
+		if (e0)
+			ntfs_free(e0);
+
+		wnd->uptodated = -1;
+
+		/* Compare with smallest fragment */
+		n = rb_last(&wnd->count_tree);
+		e = rb_entry(n, struct e_node, count.node);
+		if (len <= e->count.key)
+			goto out; /* Do not insert small fragments */
+
+		if (build) {
+			struct e_node *e2;
+
+			n = rb_prev(n);
+			e2 = rb_entry(n, struct e_node, count.node);
+			/* smallest fragment will be 'e2->count.key' */
+			wnd->extent_min = e2->count.key;
+		}
+
+		/* Replace smallest fragment by new one */
+		rb_erase(&e->start.node, &wnd->start_tree);
+		rb_erase(&e->count.node, &wnd->count_tree);
+		wnd->count -= 1;
+	} else {
+		e = e0 ? e0 : ntfs_malloc(sizeof(struct e_node));
+		if (!e) {
+			wnd->uptodated = -1;
+			goto out;
+		}
+
+		if (build && len <= wnd->extent_min)
+			wnd->extent_min = len;
+	}
+	e->start.key = bit;
+	e->count.key = len;
+	if (len > wnd->extent_max)
+		wnd->extent_max = len;
+
+	rb_insert_start(&wnd->start_tree, e);
+	rb_insert_count(&wnd->count_tree, e);
+	wnd->count += 1;
+
+out:;
+}
+
+/*
+ * wnd_remove_free_ext
+ *
+ * removes a run from the cached free space
+ */
+static void wnd_remove_free_ext(struct wnd_bitmap *wnd, size_t bit, size_t len)
+{
+	struct rb_node *n, *n3;
+	struct e_node *e, *e3;
+	size_t end_in = bit + len;
+	size_t end3, end, new_key, new_len, max_new_len;
+
+	/* Try to find extent before 'bit' */
+	n = rb_lookup(&wnd->start_tree, bit);
+
+	if (!n)
+		return;
+
+	e = rb_entry(n, struct e_node, start.node);
+	end = e->start.key + e->count.key;
+
+	new_key = new_len = 0;
+	len = e->count.key;
+
+	/* Range [bit,end_in) must be inside 'e' or outside 'e' and 'n' */
+	if (e->start.key > bit)
+		;
+	else if (end_in <= end) {
+		/* Range [bit,end_in) inside 'e' */
+		new_key = end_in;
+		new_len = end - end_in;
+		len = bit - e->start.key;
+	} else if (bit > end) {
+		bool bmax = false;
+
+		n3 = rb_next(n);
+
+		while (n3) {
+			e3 = rb_entry(n3, struct e_node, start.node);
+			if (e3->start.key >= end_in)
+				break;
+
+			if (e3->count.key == wnd->extent_max)
+				bmax = true;
+
+			end3 = e3->start.key + e3->count.key;
+			if (end3 > end_in) {
+				e3->start.key = end_in;
+				rb_erase(&e3->count.node, &wnd->count_tree);
+				e3->count.key = end3 - end_in;
+				rb_insert_count(&wnd->count_tree, e3);
+				break;
+			}
+
+			n3 = rb_next(n3);
+			rb_erase(&e3->start.node, &wnd->start_tree);
+			rb_erase(&e3->count.node, &wnd->count_tree);
+			wnd->count -= 1;
+			ntfs_free(e3);
+		}
+		if (!bmax)
+			return;
+		n3 = rb_first(&wnd->count_tree);
+		wnd->extent_max =
+			n3 ? rb_entry(n3, struct e_node, count.node)->count.key :
+			     0;
+		return;
+	}
+
+	if (e->count.key != wnd->extent_max) {
+		;
+	} else if (rb_prev(&e->count.node)) {
+		;
+	} else {
+		n3 = rb_next(&e->count.node);
+		max_new_len = len > new_len ? len : new_len;
+		if (!n3) {
+			wnd->extent_max = max_new_len;
+		} else {
+			e3 = rb_entry(n3, struct e_node, count.node);
+			wnd->extent_max = max(e3->count.key, max_new_len);
+		}
+	}
+
+	if (!len) {
+		if (new_len) {
+			e->start.key = new_key;
+			rb_erase(&e->count.node, &wnd->count_tree);
+			e->count.key = new_len;
+			rb_insert_count(&wnd->count_tree, e);
+		} else {
+			rb_erase(&e->start.node, &wnd->start_tree);
+			rb_erase(&e->count.node, &wnd->count_tree);
+			wnd->count -= 1;
+			ntfs_free(e);
+		}
+		goto out;
+	}
+	rb_erase(&e->count.node, &wnd->count_tree);
+	e->count.key = len;
+	rb_insert_count(&wnd->count_tree, e);
+
+	if (!new_len)
+		goto out;
+
+	if (wnd->count >= NTFS_MAX_WND_EXTENTS) {
+		wnd->uptodated = -1;
+
+		/* Get minimal extent */
+		e = rb_entry(rb_last(&wnd->count_tree), struct e_node,
+			     count.node);
+		if (e->count.key > new_len)
+			goto out;
+
+		/* Replace minimum */
+		rb_erase(&e->start.node, &wnd->start_tree);
+		rb_erase(&e->count.node, &wnd->count_tree);
+		wnd->count -= 1;
+	} else {
+		e = ntfs_malloc(sizeof(struct e_node));
+		if (!e)
+			wnd->uptodated = -1;
+	}
+
+	if (e) {
+		e->start.key = new_key;
+		e->count.key = new_len;
+		rb_insert_start(&wnd->start_tree, e);
+		rb_insert_count(&wnd->count_tree, e);
+		wnd->count += 1;
+	}
+
+out:
+	if (!wnd->count && 1 != wnd->uptodated)
+		wnd_rescan(wnd);
+}
+
+/*
+ * wnd_rescan
+ *
+ * Scan all bitmap. used while initialization.
+ */
+static int wnd_rescan(struct wnd_bitmap *wnd)
+{
+	int err = 0;
+	size_t prev_tail = 0;
+	struct super_block *sb = wnd->sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	u64 lbo, len = 0;
+	u32 blocksize = sb->s_blocksize;
+	u8 cluster_bits = sbi->cluster_bits;
+	u32 wbits = 8 * sb->s_blocksize;
+	u32 used, frb;
+	const ulong *buf;
+	size_t wpos, wbit, iw, vbo;
+	struct buffer_head *bh = NULL;
+	CLST lcn, clen;
+
+	wnd->uptodated = 0;
+	wnd->extent_max = 0;
+	wnd->extent_min = MINUS_ONE_T;
+	wnd->total_zeroes = 0;
+
+	vbo = 0;
+
+	for (iw = 0; iw < wnd->nwnd; iw++) {
+		if (iw + 1 == wnd->nwnd)
+			wbits = wnd->bits_last;
+
+		if (wnd->inited) {
+			if (!wnd->free_bits[iw]) {
+				/* all ones */
+				if (prev_tail) {
+					wnd_add_free_ext(wnd,
+							 vbo * 8 - prev_tail,
+							 prev_tail, true);
+					prev_tail = 0;
+				}
+				goto next_wnd;
+			}
+			if (wbits == wnd->free_bits[iw]) {
+				/* all zeroes */
+				prev_tail += wbits;
+				wnd->total_zeroes += wbits;
+				goto next_wnd;
+			}
+		}
+
+		if (!len) {
+			u32 off = vbo & sbi->cluster_mask;
+
+			if (!run_lookup_entry(&wnd->run, vbo >> cluster_bits,
+					      &lcn, &clen, NULL)) {
+				err = -ENOENT;
+				goto out;
+			}
+
+			lbo = ((u64)lcn << cluster_bits) + off;
+			len = ((u64)clen << cluster_bits) - off;
+		}
+
+		bh = ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+		if (!bh) {
+			err = -EIO;
+			goto out;
+		}
+
+		buf = (ulong *)bh->b_data;
+
+		used = __bitmap_weight(buf, wbits);
+		if (used < wbits) {
+			frb = wbits - used;
+			wnd->free_bits[iw] = frb;
+			wnd->total_zeroes += frb;
+		}
+
+		wpos = 0;
+		wbit = vbo * 8;
+
+		if (wbit + wbits > wnd->nbits)
+			wbits = wnd->nbits - wbit;
+
+		do {
+			used = find_next_zero_bit(buf, wbits, wpos);
+
+			if (used > wpos && prev_tail) {
+				wnd_add_free_ext(wnd, wbit + wpos - prev_tail,
+						 prev_tail, true);
+				prev_tail = 0;
+			}
+
+			wpos = used;
+
+			if (wpos >= wbits) {
+				/* No free blocks */
+				prev_tail = 0;
+				break;
+			}
+
+			frb = find_next_bit(buf, wbits, wpos);
+			if (frb >= wbits) {
+				/* keep last free block */
+				prev_tail += frb - wpos;
+				break;
+			}
+
+			wnd_add_free_ext(wnd, wbit + wpos - prev_tail,
+					 frb + prev_tail - wpos, true);
+
+			/* Skip free block and first '1' */
+			wpos = frb + 1;
+			/* Reset previous tail */
+			prev_tail = 0;
+		} while (wpos < wbits);
+
+next_wnd:
+
+		if (bh)
+			put_bh(bh);
+		bh = NULL;
+
+		vbo += blocksize;
+		if (len) {
+			len -= blocksize;
+			lbo += blocksize;
+		}
+	}
+
+	/* Add last block */
+	if (prev_tail)
+		wnd_add_free_ext(wnd, wnd->nbits - prev_tail, prev_tail, true);
+
+	/*
+	 * Before init cycle wnd->uptodated was 0
+	 * If any errors or limits occurs while initialization then
+	 * wnd->uptodated will be -1
+	 * If 'uptodated' is still 0 then Tree is really updated
+	 */
+	if (!wnd->uptodated)
+		wnd->uptodated = 1;
+
+	if (wnd->zone_bit != wnd->zone_end) {
+		size_t zlen = wnd->zone_end - wnd->zone_bit;
+
+		wnd->zone_end = wnd->zone_bit;
+		wnd_zone_set(wnd, wnd->zone_bit, zlen);
+	}
+
+out:
+	return err;
+}
+
+/*
+ * wnd_init
+ */
+int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits)
+{
+	int err;
+	u32 blocksize = sb->s_blocksize;
+	u32 wbits = blocksize * 8;
+
+	init_rwsem(&wnd->rw_lock);
+
+	wnd->sb = sb;
+	wnd->nbits = nbits;
+	wnd->total_zeroes = nbits;
+	wnd->extent_max = MINUS_ONE_T;
+	wnd->zone_bit = wnd->zone_end = 0;
+	wnd->nwnd = bytes_to_block(sb, bitmap_size(nbits));
+	wnd->bits_last = nbits & (wbits - 1);
+	if (!wnd->bits_last)
+		wnd->bits_last = wbits;
+
+	if (wnd->nwnd <= ARRAY_SIZE(wnd->free_holder)) {
+		wnd->free_bits = wnd->free_holder;
+	} else {
+		wnd->free_bits = ntfs_zalloc(wnd->nwnd * sizeof(u16));
+		if (!wnd->free_bits)
+			return -ENOMEM;
+	}
+
+	err = wnd_rescan(wnd);
+	if (err)
+		return err;
+
+	wnd->inited = true;
+
+	return 0;
+}
+
+/*
+ * wnd_map
+ *
+ * call sb_bread for requested window
+ */
+static struct buffer_head *wnd_map(struct wnd_bitmap *wnd, size_t iw)
+{
+	size_t vbo;
+	CLST lcn, clen;
+	struct super_block *sb = wnd->sb;
+	struct ntfs_sb_info *sbi;
+	struct buffer_head *bh;
+	u64 lbo;
+
+	sbi = sb->s_fs_info;
+	vbo = (u64)iw << sb->s_blocksize_bits;
+
+	if (!run_lookup_entry(&wnd->run, vbo >> sbi->cluster_bits, &lcn, &clen,
+			      NULL)) {
+		return ERR_PTR(-ENOENT);
+	}
+
+	lbo = ((u64)lcn << sbi->cluster_bits) + (vbo & sbi->cluster_mask);
+
+	bh = ntfs_bread(wnd->sb, lbo >> sb->s_blocksize_bits);
+	if (!bh)
+		return ERR_PTR(-EIO);
+
+	return bh;
+}
+
+/*
+ * wnd_set_free
+ *
+ * Marks the bits range from bit to bit + bits as free
+ */
+int wnd_set_free(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+	int err = 0;
+	struct super_block *sb = wnd->sb;
+	size_t bits0 = bits;
+	u32 wbits = 8 * sb->s_blocksize;
+	size_t iw = bit >> (sb->s_blocksize_bits + 3);
+	u32 wbit = bit & (wbits - 1);
+	struct buffer_head *bh;
+
+	while (iw < wnd->nwnd && bits) {
+		u32 tail, op;
+		ulong *buf;
+
+		if (iw + 1 == wnd->nwnd)
+			wbits = wnd->bits_last;
+
+		tail = wbits - wbit;
+		op = tail < bits ? tail : bits;
+
+		bh = wnd_map(wnd, iw);
+		if (IS_ERR(bh)) {
+			err = PTR_ERR(bh);
+			break;
+		}
+
+		buf = (ulong *)bh->b_data;
+
+		lock_buffer(bh);
+
+		__bitmap_clear(buf, wbit, op);
+
+		wnd->free_bits[iw] += op;
+
+		set_buffer_uptodate(bh);
+		mark_buffer_dirty(bh);
+		unlock_buffer(bh);
+		put_bh(bh);
+
+		wnd->total_zeroes += op;
+		bits -= op;
+		wbit = 0;
+		iw += 1;
+	}
+
+	wnd_add_free_ext(wnd, bit, bits0, false);
+
+	return err;
+}
+
+/*
+ * wnd_set_used
+ *
+ * Marks the bits range from bit to bit + bits as used
+ */
+int wnd_set_used(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+	int err = 0;
+	struct super_block *sb = wnd->sb;
+	size_t bits0 = bits;
+	size_t iw = bit >> (sb->s_blocksize_bits + 3);
+	u32 wbits = 8 * sb->s_blocksize;
+	u32 wbit = bit & (wbits - 1);
+	struct buffer_head *bh;
+
+	while (iw < wnd->nwnd && bits) {
+		u32 tail, op;
+		ulong *buf;
+
+		if (unlikely(iw + 1 == wnd->nwnd))
+			wbits = wnd->bits_last;
+
+		tail = wbits - wbit;
+		op = tail < bits ? tail : bits;
+
+		bh = wnd_map(wnd, iw);
+		if (IS_ERR(bh)) {
+			err = PTR_ERR(bh);
+			break;
+		}
+		buf = (ulong *)bh->b_data;
+
+		lock_buffer(bh);
+
+		__bitmap_set(buf, wbit, op);
+		wnd->free_bits[iw] -= op;
+
+		set_buffer_uptodate(bh);
+		mark_buffer_dirty(bh);
+		unlock_buffer(bh);
+		put_bh(bh);
+
+		wnd->total_zeroes -= op;
+		bits -= op;
+		wbit = 0;
+		iw += 1;
+	}
+
+	if (!RB_EMPTY_ROOT(&wnd->start_tree))
+		wnd_remove_free_ext(wnd, bit, bits0);
+
+	return err;
+}
+
+/*
+ * wnd_is_free_hlp
+ *
+ * Returns true if all clusters [bit, bit+bits) are free (bitmap only)
+ */
+static bool wnd_is_free_hlp(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+	struct super_block *sb = wnd->sb;
+	size_t iw = bit >> (sb->s_blocksize_bits + 3);
+	u32 wbits = 8 * sb->s_blocksize;
+	u32 wbit = bit & (wbits - 1);
+
+	while (iw < wnd->nwnd && bits) {
+		u32 tail, op;
+
+		if (unlikely(iw + 1 == wnd->nwnd))
+			wbits = wnd->bits_last;
+
+		tail = wbits - wbit;
+		op = tail < bits ? tail : bits;
+
+		if (wbits != wnd->free_bits[iw]) {
+			bool ret;
+			struct buffer_head *bh = wnd_map(wnd, iw);
+
+			if (IS_ERR(bh))
+				return false;
+
+			ret = are_bits_clear((ulong *)bh->b_data, wbit, op);
+
+			put_bh(bh);
+			if (!ret)
+				return false;
+		}
+
+		bits -= op;
+		wbit = 0;
+		iw += 1;
+	}
+
+	return true;
+}
+
+/*
+ * wnd_is_free
+ *
+ * Returns true if all clusters [bit, bit+bits) are free
+ */
+bool wnd_is_free(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+	bool ret;
+	struct rb_node *n;
+	size_t end;
+	struct e_node *e;
+
+	if (RB_EMPTY_ROOT(&wnd->start_tree))
+		goto use_wnd;
+
+	n = rb_lookup(&wnd->start_tree, bit);
+	if (!n)
+		goto use_wnd;
+
+	e = rb_entry(n, struct e_node, start.node);
+
+	end = e->start.key + e->count.key;
+
+	if (bit < end && bit + bits <= end)
+		return true;
+
+use_wnd:
+	ret = wnd_is_free_hlp(wnd, bit, bits);
+
+	return ret;
+}
+
+/*
+ * wnd_is_used
+ *
+ * Returns true if all clusters [bit, bit+bits) are used
+ */
+bool wnd_is_used(struct wnd_bitmap *wnd, size_t bit, size_t bits)
+{
+	bool ret = false;
+	struct super_block *sb = wnd->sb;
+	size_t iw = bit >> (sb->s_blocksize_bits + 3);
+	u32 wbits = 8 * sb->s_blocksize;
+	u32 wbit = bit & (wbits - 1);
+	size_t end;
+	struct rb_node *n;
+	struct e_node *e;
+
+	if (RB_EMPTY_ROOT(&wnd->start_tree))
+		goto use_wnd;
+
+	end = bit + bits;
+	n = rb_lookup(&wnd->start_tree, end - 1);
+	if (!n)
+		goto use_wnd;
+
+	e = rb_entry(n, struct e_node, start.node);
+	if (e->start.key + e->count.key > bit)
+		return false;
+
+use_wnd:
+	while (iw < wnd->nwnd && bits) {
+		u32 tail, op;
+
+		if (unlikely(iw + 1 == wnd->nwnd))
+			wbits = wnd->bits_last;
+
+		tail = wbits - wbit;
+		op = tail < bits ? tail : bits;
+
+		if (wnd->free_bits[iw]) {
+			bool ret;
+			struct buffer_head *bh = wnd_map(wnd, iw);
+
+			if (IS_ERR(bh))
+				goto out;
+
+			ret = are_bits_set((ulong *)bh->b_data, wbit, op);
+			put_bh(bh);
+			if (!ret)
+				goto out;
+		}
+
+		bits -= op;
+		wbit = 0;
+		iw += 1;
+	}
+	ret = true;
+
+out:
+	return ret;
+}
+
+/*
+ * wnd_find
+ * - flags - BITMAP_FIND_XXX flags
+ *
+ * looks for free space
+ * Returns 0 if not found
+ */
+size_t wnd_find(struct wnd_bitmap *wnd, size_t to_alloc, size_t hint,
+		size_t flags, size_t *allocated)
+{
+	struct super_block *sb;
+	u32 wbits, wpos, wzbit, wzend;
+	size_t fnd, max_alloc, b_len, b_pos;
+	size_t iw, prev_tail, nwnd, wbit, ebit, zbit, zend;
+	size_t to_alloc0 = to_alloc;
+	const ulong *buf;
+	const struct e_node *e;
+	const struct rb_node *pr, *cr;
+	u8 log2_bits;
+	bool fbits_valid;
+	struct buffer_head *bh;
+
+	/* fast checking for available free space */
+	if (flags & BITMAP_FIND_FULL) {
+		size_t zeroes = wnd_zeroes(wnd);
+
+		zeroes -= wnd->zone_end - wnd->zone_bit;
+		if (zeroes < to_alloc0)
+			goto no_space;
+
+		if (to_alloc0 > wnd->extent_max)
+			goto no_space;
+	} else {
+		if (to_alloc > wnd->extent_max)
+			to_alloc = wnd->extent_max;
+	}
+
+	if (wnd->zone_bit <= hint && hint < wnd->zone_end)
+		hint = wnd->zone_end;
+
+	max_alloc = wnd->nbits;
+	b_len = b_pos = 0;
+
+	if (hint >= max_alloc)
+		hint = 0;
+
+	if (RB_EMPTY_ROOT(&wnd->start_tree)) {
+		if (wnd->uptodated == 1) {
+			/* extents tree is updated -> no free space */
+			goto no_space;
+		}
+		goto scan_bitmap;
+	}
+
+	e = NULL;
+	if (!hint)
+		goto allocate_biggest;
+
+	/* Use hint: enumerate extents by start >= hint */
+	pr = NULL;
+	cr = wnd->start_tree.rb_node;
+
+	for (;;) {
+		e = rb_entry(cr, struct e_node, start.node);
+
+		if (e->start.key == hint)
+			break;
+
+		if (e->start.key < hint) {
+			pr = cr;
+			cr = cr->rb_right;
+			if (!cr)
+				break;
+			continue;
+		}
+
+		cr = cr->rb_left;
+		if (!cr) {
+			e = pr ? rb_entry(pr, struct e_node, start.node) : NULL;
+			break;
+		}
+	}
+
+	if (!e)
+		goto allocate_biggest;
+
+	if (e->start.key + e->count.key > hint) {
+		/* We have found extension with 'hint' inside */
+		size_t len = e->start.key + e->count.key - hint;
+
+		if (len >= to_alloc && hint + to_alloc <= max_alloc) {
+			fnd = hint;
+			goto found;
+		}
+
+		if (!(flags & BITMAP_FIND_FULL)) {
+			if (len > to_alloc)
+				len = to_alloc;
+
+			if (hint + len <= max_alloc) {
+				fnd = hint;
+				to_alloc = len;
+				goto found;
+			}
+		}
+	}
+
+allocate_biggest:
+	/* Allocate from biggest free extent */
+	e = rb_entry(rb_first(&wnd->count_tree), struct e_node, count.node);
+	if (e->count.key != wnd->extent_max)
+		wnd->extent_max = e->count.key;
+
+	if (e->count.key < max_alloc) {
+		if (e->count.key >= to_alloc) {
+			;
+		} else if (flags & BITMAP_FIND_FULL) {
+			if (e->count.key < to_alloc0) {
+				/* Biggest free block is less then requested */
+				goto no_space;
+			}
+			to_alloc = e->count.key;
+		} else if (-1 != wnd->uptodated) {
+			to_alloc = e->count.key;
+		} else {
+			/* Check if we can use more bits */
+			size_t op, max_check;
+			struct rb_root start_tree;
+
+			memcpy(&start_tree, &wnd->start_tree,
+			       sizeof(struct rb_root));
+			memset(&wnd->start_tree, 0, sizeof(struct rb_root));
+
+			max_check = e->start.key + to_alloc;
+			if (max_check > max_alloc)
+				max_check = max_alloc;
+			for (op = e->start.key + e->count.key; op < max_check;
+			     op++) {
+				if (!wnd_is_free(wnd, op, 1))
+					break;
+			}
+			memcpy(&wnd->start_tree, &start_tree,
+			       sizeof(struct rb_root));
+			to_alloc = op - e->start.key;
+		}
+
+		/* Prepare to return */
+		fnd = e->start.key;
+		if (e->start.key + to_alloc > max_alloc)
+			to_alloc = max_alloc - e->start.key;
+		goto found;
+	}
+
+	if (wnd->uptodated == 1) {
+		/* extents tree is updated -> no free space */
+		goto no_space;
+	}
+
+	b_len = e->count.key;
+	b_pos = e->start.key;
+
+scan_bitmap:
+	sb = wnd->sb;
+	log2_bits = sb->s_blocksize_bits + 3;
+
+	/* At most two ranges [hint, max_alloc) + [0, hint) */
+Again:
+
+	/* TODO: optimize request for case nbits > wbits */
+	iw = hint >> log2_bits;
+	wbits = sb->s_blocksize * 8;
+	wpos = hint & (wbits - 1);
+	prev_tail = 0;
+	fbits_valid = true;
+
+	if (max_alloc == wnd->nbits) {
+		nwnd = wnd->nwnd;
+	} else {
+		size_t t = max_alloc + wbits - 1;
+
+		nwnd = likely(t > max_alloc) ? (t >> log2_bits) : wnd->nwnd;
+	}
+
+	/* Enumerate all windows */
+	for (; iw < nwnd; iw++) {
+		wbit = iw << log2_bits;
+
+		if (!wnd->free_bits[iw]) {
+			if (prev_tail > b_len) {
+				b_pos = wbit - prev_tail;
+				b_len = prev_tail;
+			}
+
+			/* Skip full used window */
+			prev_tail = 0;
+			wpos = 0;
+			continue;
+		}
+
+		if (unlikely(iw + 1 == nwnd)) {
+			if (max_alloc == wnd->nbits) {
+				wbits = wnd->bits_last;
+			} else {
+				size_t t = max_alloc & (wbits - 1);
+
+				if (t) {
+					wbits = t;
+					fbits_valid = false;
+				}
+			}
+		}
+
+		if (wnd->zone_end > wnd->zone_bit) {
+			ebit = wbit + wbits;
+			zbit = max(wnd->zone_bit, wbit);
+			zend = min(wnd->zone_end, ebit);
+
+			/* Here we have a window [wbit, ebit) and zone [zbit, zend) */
+			if (zend <= zbit) {
+				/* Zone does not overlap window */
+			} else {
+				wzbit = zbit - wbit;
+				wzend = zend - wbit;
+
+				/* Zone overlaps window */
+				if (wnd->free_bits[iw] == wzend - wzbit) {
+					prev_tail = 0;
+					wpos = 0;
+					continue;
+				}
+
+				/* Scan two ranges window: [wbit, zbit) and [zend, ebit) */
+				bh = wnd_map(wnd, iw);
+
+				if (IS_ERR(bh)) {
+					/* TODO: error */
+					prev_tail = 0;
+					wpos = 0;
+					continue;
+				}
+
+				buf = (ulong *)bh->b_data;
+
+				/* Scan range [wbit, zbit) */
+				if (wpos < wzbit) {
+					/* Scan range [wpos, zbit) */
+					fnd = wnd_scan(buf, wbit, wpos, wzbit,
+						       to_alloc, &prev_tail,
+						       &b_pos, &b_len);
+					if (fnd != MINUS_ONE_T) {
+						put_bh(bh);
+						goto found;
+					}
+				}
+
+				prev_tail = 0;
+
+				/* Scan range [zend, ebit) */
+				if (wzend < wbits) {
+					fnd = wnd_scan(buf, wbit,
+						       max(wzend, wpos), wbits,
+						       to_alloc, &prev_tail,
+						       &b_pos, &b_len);
+					if (fnd != MINUS_ONE_T) {
+						put_bh(bh);
+						goto found;
+					}
+				}
+
+				wpos = 0;
+				put_bh(bh);
+				continue;
+			}
+		}
+
+		/* Current window does not overlap zone */
+		if (!wpos && fbits_valid && wnd->free_bits[iw] == wbits) {
+			/* window is empty */
+			if (prev_tail + wbits >= to_alloc) {
+				fnd = wbit + wpos - prev_tail;
+				goto found;
+			}
+
+			/* Increase 'prev_tail' and process next window */
+			prev_tail += wbits;
+			wpos = 0;
+			continue;
+		}
+
+		/* read window */
+		bh = wnd_map(wnd, iw);
+		if (IS_ERR(bh)) {
+			// TODO: error
+			prev_tail = 0;
+			wpos = 0;
+			continue;
+		}
+
+		buf = (ulong *)bh->b_data;
+
+		/* Scan range [wpos, eBits) */
+		fnd = wnd_scan(buf, wbit, wpos, wbits, to_alloc, &prev_tail,
+			       &b_pos, &b_len);
+		put_bh(bh);
+		if (fnd != MINUS_ONE_T)
+			goto found;
+	}
+
+	if (b_len < prev_tail) {
+		/* The last fragment */
+		b_len = prev_tail;
+		b_pos = max_alloc - prev_tail;
+	}
+
+	if (hint) {
+		/*
+		 * We have scanned range [hint max_alloc)
+		 * Prepare to scan range [0 hint + to_alloc)
+		 */
+		size_t nextmax = hint + to_alloc;
+
+		if (likely(nextmax >= hint) && nextmax < max_alloc)
+			max_alloc = nextmax;
+		hint = 0;
+		goto Again;
+	}
+
+	if (!b_len)
+		goto no_space;
+
+	wnd->extent_max = b_len;
+
+	if (flags & BITMAP_FIND_FULL)
+		goto no_space;
+
+	fnd = b_pos;
+	to_alloc = b_len;
+
+found:
+	if (flags & BITMAP_FIND_MARK_AS_USED) {
+		/* TODO optimize remove extent (pass 'e'?) */
+		if (wnd_set_used(wnd, fnd, to_alloc))
+			goto no_space;
+	} else if (wnd->extent_max != MINUS_ONE_T &&
+		   to_alloc > wnd->extent_max) {
+		wnd->extent_max = to_alloc;
+	}
+
+	*allocated = fnd;
+	return to_alloc;
+
+no_space:
+	return 0;
+}
+
+/*
+ * wnd_extend
+ *
+ * Extend bitmap ($MFT bitmap)
+ */
+int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits)
+{
+	int err;
+	struct super_block *sb = wnd->sb;
+	struct ntfs_sb_info *sbi = sb->s_fs_info;
+	u32 blocksize = sb->s_blocksize;
+	u32 wbits = blocksize * 8;
+	u32 b0, new_last;
+	size_t bits, iw, new_wnd;
+	size_t old_bits = wnd->nbits;
+	u16 *new_free;
+
+	if (new_bits <= old_bits)
+		return -EINVAL;
+
+	/* align to 8 byte boundary */
+	new_wnd = bytes_to_block(sb, bitmap_size(new_bits));
+	new_last = new_bits & (wbits - 1);
+	if (!new_last)
+		new_last = wbits;
+
+	if (new_wnd != wnd->nwnd) {
+		if (new_wnd <= ARRAY_SIZE(wnd->free_holder)) {
+			new_free = wnd->free_holder;
+		} else {
+			new_free = ntfs_malloc(new_wnd * sizeof(u16));
+			if (!new_free)
+				return -ENOMEM;
+		}
+
+		if (new_free != wnd->free_bits)
+			memcpy(new_free, wnd->free_bits,
+			       wnd->nwnd * sizeof(short));
+		memset(new_free + wnd->nwnd, 0,
+		       (new_wnd - wnd->nwnd) * sizeof(short));
+		if (wnd->free_bits != wnd->free_holder)
+			ntfs_free(wnd->free_bits);
+
+		wnd->free_bits = new_free;
+	}
+
+	/* Zero bits [old_bits,new_bits) */
+	bits = new_bits - old_bits;
+	b0 = old_bits & (wbits - 1);
+
+	for (iw = old_bits >> (sb->s_blocksize_bits + 3); bits; iw += 1) {
+		u32 op;
+		size_t frb;
+		u64 vbo, lbo, bytes;
+		struct buffer_head *bh;
+		ulong *buf;
+
+		if (iw + 1 == new_wnd)
+			wbits = new_last;
+
+		op = b0 + bits > wbits ? wbits - b0 : bits;
+		vbo = (u64)iw * blocksize;
+
+		err = ntfs_vbo_to_lbo(sbi, &wnd->run, vbo, &lbo, &bytes);
+		if (err)
+			break;
+
+		bh = ntfs_bread(sb, lbo >> sb->s_blocksize_bits);
+		if (!bh)
+			return -EIO;
+
+		lock_buffer(bh);
+		buf = (ulong *)bh->b_data;
+
+		__bitmap_clear(buf, b0, blocksize * 8 - b0);
+		frb = wbits - __bitmap_weight(buf, wbits);
+		wnd->total_zeroes += frb - wnd->free_bits[iw];
+		wnd->free_bits[iw] = frb;
+
+		set_buffer_uptodate(bh);
+		mark_buffer_dirty(bh);
+		unlock_buffer(bh);
+		/*err = sync_dirty_buffer(bh);*/
+
+		b0 = 0;
+		bits -= op;
+	}
+
+	wnd->nbits = new_bits;
+	wnd->nwnd = new_wnd;
+	wnd->bits_last = new_last;
+
+	wnd_add_free_ext(wnd, old_bits, new_bits - old_bits, false);
+
+	return 0;
+}
+
+/*
+ * wnd_zone_set
+ */
+void wnd_zone_set(struct wnd_bitmap *wnd, size_t lcn, size_t len)
+{
+	size_t zlen;
+
+	zlen = wnd->zone_end - wnd->zone_bit;
+	if (zlen)
+		wnd_add_free_ext(wnd, wnd->zone_bit, zlen, false);
+
+	if (!RB_EMPTY_ROOT(&wnd->start_tree) && len)
+		wnd_remove_free_ext(wnd, lcn, len);
+
+	wnd->zone_bit = lcn;
+	wnd->zone_end = lcn + len;
+}
+
+int ntfs_trim_fs(struct ntfs_sb_info *sbi, struct fstrim_range *range)
+{
+	int err = 0;
+	struct super_block *sb = sbi->sb;
+	struct wnd_bitmap *wnd = &sbi->used.bitmap;
+	u32 wbits = 8 * sb->s_blocksize;
+	CLST len = 0, lcn = 0, done = 0;
+	CLST minlen = bytes_to_cluster(sbi, range->minlen);
+	CLST lcn_from = bytes_to_cluster(sbi, range->start);
+	size_t iw = lcn_from >> (sb->s_blocksize_bits + 3);
+	u32 wbit = lcn_from & (wbits - 1);
+	const ulong *buf;
+	CLST lcn_to;
+
+	if (!minlen)
+		minlen = 1;
+
+	if (range->len == (u64)-1)
+		lcn_to = wnd->nbits;
+	else
+		lcn_to = bytes_to_cluster(sbi, range->start + range->len);
+
+	down_read_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+
+	for (; iw < wnd->nbits; iw++, wbit = 0) {
+		CLST lcn_wnd = iw * wbits;
+		struct buffer_head *bh;
+
+		if (lcn_wnd > lcn_to)
+			break;
+
+		if (!wnd->free_bits[iw])
+			continue;
+
+		if (iw + 1 == wnd->nwnd)
+			wbits = wnd->bits_last;
+
+		if (lcn_wnd + wbits > lcn_to)
+			wbits = lcn_to - lcn_wnd;
+
+		bh = wnd_map(wnd, iw);
+		if (IS_ERR(bh)) {
+			err = PTR_ERR(bh);
+			break;
+		}
+
+		buf = (ulong *)bh->b_data;
+
+		for (; wbit < wbits; wbit++) {
+			if (!test_bit(wbit, buf)) {
+				if (!len)
+					lcn = lcn_wnd + wbit;
+				len += 1;
+				continue;
+			}
+			if (len >= minlen) {
+				err = ntfs_discard(sbi, lcn, len);
+				if (err)
+					goto out;
+				done += len;
+			}
+			len = 0;
+		}
+		put_bh(bh);
+	}
+
+	/* Process the last fragment */
+	if (len >= minlen) {
+		err = ntfs_discard(sbi, lcn, len);
+		if (err)
+			goto out;
+		done += len;
+	}
+
+out:
+	range->len = (u64)done << sbi->cluster_bits;
+
+	up_read(&wnd->rw_lock);
+
+	return err;
+}

From patchwork Thu Jan 28 09:04:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053211
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 1571AC433DB
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:41 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 928C164DD1
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:40 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S232005AbhA1JIK (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:08:10 -0500
Received: from relayfre-01.paragon-software.com ([176.12.100.13]:47856 "EHLO
        relayfre-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231819AbhA1JGP (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:06:15 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relayfre-01.paragon-software.com (Postfix) with ESMTPS id
 39F661F7F;
        Thu, 28 Jan 2021 12:05:04 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824704;
        bh=nApD0MTUPm9F7X0wj1p0Lp4uizCUF/vwXbfNSdBLhhQ=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=P3Sg7F8hfmgGN5Bv3FeWjtbCB2wp/4MQsu3lmH0OULMYwgKfvy3q6ljYvRV3+TgLG
         UWlEpK+8us/Wil3KElWENDiZGMSUd7jAzSnRzAA1HQJ17Vcv4rgSa63WLv0SvRzylt
         4cObr5f4f8AU214EOGLzZL4pwBpvFyDyzTImbdx8=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:03 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 05/10] fs/ntfs3: Add attrib operations
Date: Thu, 28 Jan 2021 12:04:50 +0300
Message-ID: 
 <20210128090455.3576502-6-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds attrib operations

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 fs/ntfs3/attrib.c   | 2085 +++++++++++++++++++++++++++++++++++++++++++
 fs/ntfs3/attrlist.c |  457 ++++++++++
 fs/ntfs3/xattr.c    | 1072 ++++++++++++++++++++++
 3 files changed, 3614 insertions(+)
 create mode 100644 fs/ntfs3/attrib.c
 create mode 100644 fs/ntfs3/attrlist.c
 create mode 100644 fs/ntfs3/xattr.c

diff --git a/fs/ntfs3/attrib.c b/fs/ntfs3/attrib.c
new file mode 100644
index 000000000000..79360d94498d
--- /dev/null
+++ b/fs/ntfs3/attrib.c
@@ -0,0 +1,2085 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ * TODO: merge attr_set_size/attr_data_get_block/attr_allocate_frame?
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/hash.h>
+#include <linux/nls.h>
+#include <linux/ratelimit.h>
+#include <linux/slab.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * You can set external NTFS_MIN_LOG2_OF_CLUMP/NTFS_MAX_LOG2_OF_CLUMP to manage
+ * preallocate algorithm
+ */
+#ifndef NTFS_MIN_LOG2_OF_CLUMP
+#define NTFS_MIN_LOG2_OF_CLUMP 16
+#endif
+
+#ifndef NTFS_MAX_LOG2_OF_CLUMP
+#define NTFS_MAX_LOG2_OF_CLUMP 26
+#endif
+
+// 16M
+#define NTFS_CLUMP_MIN (1 << (NTFS_MIN_LOG2_OF_CLUMP + 8))
+// 16G
+#define NTFS_CLUMP_MAX (1ull << (NTFS_MAX_LOG2_OF_CLUMP + 8))
+
+/*
+ * get_pre_allocated
+ *
+ */
+static inline u64 get_pre_allocated(u64 size)
+{
+	u32 clump;
+	u8 align_shift;
+	u64 ret;
+
+	if (size <= NTFS_CLUMP_MIN) {
+		clump = 1 << NTFS_MIN_LOG2_OF_CLUMP;
+		align_shift = NTFS_MIN_LOG2_OF_CLUMP;
+	} else if (size >= NTFS_CLUMP_MAX) {
+		clump = 1 << NTFS_MAX_LOG2_OF_CLUMP;
+		align_shift = NTFS_MAX_LOG2_OF_CLUMP;
+	} else {
+		align_shift = NTFS_MIN_LOG2_OF_CLUMP - 1 +
+			      __ffs(size >> (8 + NTFS_MIN_LOG2_OF_CLUMP));
+		clump = 1u << align_shift;
+	}
+
+	ret = (((size + clump - 1) >> align_shift)) << align_shift;
+
+	return ret;
+}
+
+/*
+ * attr_must_be_resident
+ *
+ * returns true if attribute must be resident
+ */
+static inline bool attr_must_be_resident(struct ntfs_sb_info *sbi,
+					 enum ATTR_TYPE type)
+{
+	const struct ATTR_DEF_ENTRY *de;
+
+	switch (type) {
+	case ATTR_STD:
+	case ATTR_NAME:
+	case ATTR_ID:
+	case ATTR_LABEL:
+	case ATTR_VOL_INFO:
+	case ATTR_ROOT:
+	case ATTR_EA_INFO:
+		return true;
+	default:
+		de = ntfs_query_def(sbi, type);
+		if (de && (de->flags & NTFS_ATTR_MUST_BE_RESIDENT))
+			return true;
+		return false;
+	}
+}
+
+/*
+ * attr_load_runs
+ *
+ * load all runs stored in 'attr'
+ */
+int attr_load_runs(struct ATTRIB *attr, struct ntfs_inode *ni,
+		   struct runs_tree *run, const CLST *vcn)
+{
+	int err;
+	CLST svcn = le64_to_cpu(attr->nres.svcn);
+	CLST evcn = le64_to_cpu(attr->nres.evcn);
+	u32 asize;
+	u16 run_off;
+
+	if (svcn >= evcn + 1 || run_is_mapped_full(run, svcn, evcn))
+		return 0;
+
+	if (vcn && (evcn < *vcn || *vcn < svcn))
+		return -EINVAL;
+
+	asize = le32_to_cpu(attr->size);
+	run_off = le16_to_cpu(attr->nres.run_off);
+	err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn,
+			    vcn ? *vcn : svcn, Add2Ptr(attr, run_off),
+			    asize - run_off);
+	if (err < 0)
+		return err;
+
+	return 0;
+}
+
+/*
+ * int run_deallocate_ex
+ *
+ * Deallocate clusters
+ */
+static int run_deallocate_ex(struct ntfs_sb_info *sbi, struct runs_tree *run,
+			     CLST vcn, CLST len, CLST *done, bool trim)
+{
+	int err = 0;
+	CLST vcn_next, vcn0 = vcn, lcn, clen, dn = 0;
+	size_t idx;
+
+	if (!len)
+		goto out;
+
+	if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+failed:
+		run_truncate(run, vcn0);
+		err = -EINVAL;
+		goto out;
+	}
+
+	for (;;) {
+		if (clen > len)
+			clen = len;
+
+		if (!clen) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		if (lcn != SPARSE_LCN) {
+			mark_as_free_ex(sbi, lcn, clen, trim);
+			dn += clen;
+		}
+
+		len -= clen;
+		if (!len)
+			break;
+
+		vcn_next = vcn + clen;
+		if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+		    vcn != vcn_next) {
+			// save memory - don't load entire run
+			goto failed;
+		}
+	}
+
+out:
+	if (done)
+		*done += dn;
+
+	return err;
+}
+
+/*
+ * attr_allocate_clusters
+ *
+ * find free space, mark it as used and store in 'run'
+ */
+int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run,
+			   CLST vcn, CLST lcn, CLST len, CLST *pre_alloc,
+			   enum ALLOCATE_OPT opt, CLST *alen, const size_t fr,
+			   CLST *new_lcn)
+{
+	int err;
+	CLST flen, vcn0 = vcn, pre = pre_alloc ? *pre_alloc : 0;
+	struct wnd_bitmap *wnd = &sbi->used.bitmap;
+	size_t cnt = run->count;
+
+	for (;;) {
+		err = ntfs_look_for_free_space(sbi, lcn, len + pre, &lcn, &flen,
+					       opt);
+
+		if (err == -ENOSPC && pre) {
+			pre = 0;
+			if (*pre_alloc)
+				*pre_alloc = 0;
+			continue;
+		}
+
+		if (err)
+			goto out;
+
+		if (new_lcn && vcn == vcn0)
+			*new_lcn = lcn;
+
+		/* Add new fragment into run storage */
+		if (!run_add_entry(run, vcn, lcn, flen, opt == ALLOCATE_MFT)) {
+			down_write_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS);
+			wnd_set_free(wnd, lcn, flen);
+			up_write(&wnd->rw_lock);
+			err = -ENOMEM;
+			goto out;
+		}
+
+		vcn += flen;
+
+		if (flen >= len || opt == ALLOCATE_MFT ||
+		    (fr && run->count - cnt >= fr)) {
+			*alen = vcn - vcn0;
+			return 0;
+		}
+
+		len -= flen;
+	}
+
+out:
+	/* undo */
+	run_deallocate_ex(sbi, run, vcn0, vcn - vcn0, NULL, false);
+	run_truncate(run, vcn0);
+
+	return err;
+}
+
+/*
+ * if page is not NULL - it is already contains resident data
+ * and locked (called from ni_write_frame)
+ */
+int attr_make_nonresident(struct ntfs_inode *ni, struct ATTRIB *attr,
+			  struct ATTR_LIST_ENTRY *le, struct mft_inode *mi,
+			  u64 new_size, struct runs_tree *run,
+			  struct ATTRIB **ins_attr, struct page *page)
+{
+	struct ntfs_sb_info *sbi;
+	struct ATTRIB *attr_s;
+	struct MFT_REC *rec;
+	u32 used, asize, rsize, aoff, align;
+	bool is_data;
+	CLST len, alen;
+	char *next;
+	int err;
+
+	if (attr->non_res) {
+		*ins_attr = attr;
+		return 0;
+	}
+
+	sbi = mi->sbi;
+	rec = mi->mrec;
+	attr_s = NULL;
+	used = le32_to_cpu(rec->used);
+	asize = le32_to_cpu(attr->size);
+	next = Add2Ptr(attr, asize);
+	aoff = PtrOffset(rec, attr);
+	rsize = le32_to_cpu(attr->res.data_size);
+	is_data = attr->type == ATTR_DATA && !attr->name_len;
+
+	align = sbi->cluster_size;
+	if (is_attr_compressed(attr))
+		align <<= COMPRESSION_UNIT;
+	len = (rsize + align - 1) >> sbi->cluster_bits;
+
+	run_init(run);
+
+	/* make a copy of original attribute */
+	attr_s = ntfs_memdup(attr, asize);
+	if (!attr_s) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	if (!len) {
+		/* empty resident -> empty nonresident */
+		alen = 0;
+	} else {
+		const char *data = resident_data(attr);
+
+		err = attr_allocate_clusters(sbi, run, 0, 0, len, NULL,
+					     ALLOCATE_DEF, &alen, 0, NULL);
+		if (err)
+			goto out1;
+
+		if (!rsize) {
+			/* empty resident -> non empty nonresident */
+		} else if (!is_data) {
+			err = ntfs_sb_write_run(sbi, run, 0, data, rsize);
+			if (err)
+				goto out2;
+		} else if (!page) {
+			char *kaddr;
+
+			page = grab_cache_page(ni->vfs_inode.i_mapping, 0);
+			if (!page) {
+				err = -ENOMEM;
+				goto out2;
+			}
+			kaddr = kmap_atomic(page);
+			memcpy(kaddr, data, rsize);
+			memset(kaddr + rsize, 0, PAGE_SIZE - rsize);
+			kunmap_atomic(kaddr);
+			flush_dcache_page(page);
+			SetPageUptodate(page);
+			set_page_dirty(page);
+			unlock_page(page);
+			put_page(page);
+		}
+	}
+
+	/* remove original attribute */
+	used -= asize;
+	memmove(attr, Add2Ptr(attr, asize), used - aoff);
+	rec->used = cpu_to_le32(used);
+	mi->dirty = true;
+	if (le)
+		al_remove_le(ni, le);
+
+	err = ni_insert_nonresident(ni, attr_s->type, attr_name(attr_s),
+				    attr_s->name_len, run, 0, alen,
+				    attr_s->flags, &attr, NULL);
+	if (err)
+		goto out3;
+
+	ntfs_free(attr_s);
+	attr->nres.data_size = cpu_to_le64(rsize);
+	attr->nres.valid_size = attr->nres.data_size;
+
+	*ins_attr = attr;
+
+	if (is_data)
+		ni->ni_flags &= ~NI_FLAG_RESIDENT;
+
+	/* Resident attribute becomes non resident */
+	return 0;
+
+out3:
+	attr = Add2Ptr(rec, aoff);
+	memmove(next, attr, used - aoff);
+	memcpy(attr, attr_s, asize);
+	rec->used = cpu_to_le32(used + asize);
+	mi->dirty = true;
+out2:
+	/* undo: do not trim new allocated clusters */
+	run_deallocate(sbi, run, false);
+	run_close(run);
+out1:
+	ntfs_free(attr_s);
+	/*reinsert le*/
+out:
+	return err;
+}
+
+/*
+ * attr_set_size_res
+ *
+ * helper for attr_set_size
+ */
+static int attr_set_size_res(struct ntfs_inode *ni, struct ATTRIB *attr,
+			     struct ATTR_LIST_ENTRY *le, struct mft_inode *mi,
+			     u64 new_size, struct runs_tree *run,
+			     struct ATTRIB **ins_attr)
+{
+	struct ntfs_sb_info *sbi = mi->sbi;
+	struct MFT_REC *rec = mi->mrec;
+	u32 used = le32_to_cpu(rec->used);
+	u32 asize = le32_to_cpu(attr->size);
+	u32 aoff = PtrOffset(rec, attr);
+	u32 rsize = le32_to_cpu(attr->res.data_size);
+	u32 tail = used - aoff - asize;
+	char *next = Add2Ptr(attr, asize);
+	s64 dsize = QuadAlign(new_size) - QuadAlign(rsize);
+
+	if (dsize < 0) {
+		memmove(next + dsize, next, tail);
+	} else if (dsize > 0) {
+		if (used + dsize > sbi->max_bytes_per_attr)
+			return attr_make_nonresident(ni, attr, le, mi, new_size,
+						     run, ins_attr, NULL);
+
+		memmove(next + dsize, next, tail);
+		memset(next, 0, dsize);
+	}
+
+	if (new_size > rsize)
+		memset(Add2Ptr(resident_data(attr), rsize), 0,
+		       new_size - rsize);
+
+	rec->used = cpu_to_le32(used + dsize);
+	attr->size = cpu_to_le32(asize + dsize);
+	attr->res.data_size = cpu_to_le32(new_size);
+	mi->dirty = true;
+	*ins_attr = attr;
+
+	return 0;
+}
+
+/*
+ * attr_set_size
+ *
+ * change the size of attribute
+ * Extend:
+ *   - sparse/compressed: no allocated clusters
+ *   - normal: append allocated and preallocated new clusters
+ * Shrink:
+ *   - no deallocate if keep_prealloc is set
+ */
+int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type,
+		  const __le16 *name, u8 name_len, struct runs_tree *run,
+		  u64 new_size, const u64 *new_valid, bool keep_prealloc,
+		  struct ATTRIB **ret)
+{
+	int err = 0;
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	u8 cluster_bits = sbi->cluster_bits;
+	bool is_mft =
+		ni->mi.rno == MFT_REC_MFT && type == ATTR_DATA && !name_len;
+	u64 old_valid, old_size, old_alloc, new_alloc, new_alloc_tmp;
+	struct ATTRIB *attr = NULL, *attr_b;
+	struct ATTR_LIST_ENTRY *le, *le_b;
+	struct mft_inode *mi, *mi_b;
+	CLST alen, vcn, lcn, new_alen, old_alen, svcn, evcn;
+	CLST next_svcn, pre_alloc = -1, done = 0;
+	bool is_ext;
+	u32 align;
+	struct MFT_REC *rec;
+
+again:
+	le_b = NULL;
+	attr_b = ni_find_attr(ni, NULL, &le_b, type, name, name_len, NULL,
+			      &mi_b);
+	if (!attr_b) {
+		err = -ENOENT;
+		goto out;
+	}
+
+	if (!attr_b->non_res) {
+		err = attr_set_size_res(ni, attr_b, le_b, mi_b, new_size, run,
+					&attr_b);
+		if (err || !attr_b->non_res)
+			goto out;
+
+		/* layout of records may be changed, so do a full search */
+		goto again;
+	}
+
+	is_ext = is_attr_ext(attr_b);
+
+again_1:
+	if (is_ext) {
+		align = 1u << (attr_b->nres.c_unit + cluster_bits);
+		if (is_attr_sparsed(attr_b))
+			keep_prealloc = false;
+	} else {
+		align = sbi->cluster_size;
+	}
+
+	old_valid = le64_to_cpu(attr_b->nres.valid_size);
+	old_size = le64_to_cpu(attr_b->nres.data_size);
+	old_alloc = le64_to_cpu(attr_b->nres.alloc_size);
+	old_alen = old_alloc >> cluster_bits;
+
+	new_alloc = (new_size + align - 1) & ~(u64)(align - 1);
+	new_alen = new_alloc >> cluster_bits;
+
+	if (keep_prealloc && is_ext)
+		keep_prealloc = false;
+
+	if (keep_prealloc && new_size < old_size) {
+		attr_b->nres.data_size = cpu_to_le64(new_size);
+		mi_b->dirty = true;
+		goto ok;
+	}
+
+	vcn = old_alen - 1;
+
+	svcn = le64_to_cpu(attr_b->nres.svcn);
+	evcn = le64_to_cpu(attr_b->nres.evcn);
+
+	if (svcn <= vcn && vcn <= evcn) {
+		attr = attr_b;
+		le = le_b;
+		mi = mi_b;
+	} else if (!le_b) {
+		err = -EINVAL;
+		goto out;
+	} else {
+		le = le_b;
+		attr = ni_find_attr(ni, attr_b, &le, type, name, name_len, &vcn,
+				    &mi);
+		if (!attr) {
+			err = -EINVAL;
+			goto out;
+		}
+
+next_le_1:
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn = le64_to_cpu(attr->nres.evcn);
+	}
+
+next_le:
+	rec = mi->mrec;
+
+	err = attr_load_runs(attr, ni, run, NULL);
+	if (err)
+		goto out;
+
+	if (new_size > old_size) {
+		CLST to_allocate;
+		size_t free;
+
+		if (new_alloc <= old_alloc) {
+			attr_b->nres.data_size = cpu_to_le64(new_size);
+			mi_b->dirty = true;
+			goto ok;
+		}
+
+		to_allocate = new_alen - old_alen;
+add_alloc_in_same_attr_seg:
+		lcn = 0;
+		if (is_mft) {
+			/* mft allocates clusters from mftzone */
+			pre_alloc = 0;
+		} else if (is_ext) {
+			/* no preallocate for sparse/compress */
+			pre_alloc = 0;
+		} else if (pre_alloc == -1) {
+			pre_alloc = 0;
+			if (type == ATTR_DATA && !name_len &&
+			    sbi->options.prealloc) {
+				CLST new_alen2 = bytes_to_cluster(
+					sbi, get_pre_allocated(new_size));
+				pre_alloc = new_alen2 - new_alen;
+			}
+
+			/* Get the last lcn to allocate from */
+			if (old_alen &&
+			    !run_lookup_entry(run, vcn, &lcn, NULL, NULL)) {
+				lcn = SPARSE_LCN;
+			}
+
+			if (lcn == SPARSE_LCN)
+				lcn = 0;
+			else if (lcn)
+				lcn += 1;
+
+			free = wnd_zeroes(&sbi->used.bitmap);
+			if (to_allocate > free) {
+				err = -ENOSPC;
+				goto out;
+			}
+
+			if (pre_alloc && to_allocate + pre_alloc > free)
+				pre_alloc = 0;
+		}
+
+		vcn = old_alen;
+
+		if (is_ext) {
+			if (!run_add_entry(run, vcn, SPARSE_LCN, to_allocate,
+					   false)) {
+				err = -ENOMEM;
+				goto out;
+			}
+			alen = to_allocate;
+		} else {
+			/* ~3 bytes per fragment */
+			err = attr_allocate_clusters(
+				sbi, run, vcn, lcn, to_allocate, &pre_alloc,
+				is_mft ? ALLOCATE_MFT : 0, &alen,
+				is_mft ? 0 :
+					 (sbi->record_size -
+					  le32_to_cpu(rec->used) + 8) /
+							 3 +
+						 1,
+				NULL);
+			if (err)
+				goto out;
+		}
+
+		done += alen;
+		vcn += alen;
+		if (to_allocate > alen)
+			to_allocate -= alen;
+		else
+			to_allocate = 0;
+
+pack_runs:
+		err = mi_pack_runs(mi, attr, run, vcn - svcn);
+		if (err)
+			goto out;
+
+		next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+		new_alloc_tmp = (u64)next_svcn << cluster_bits;
+		attr_b->nres.alloc_size = cpu_to_le64(new_alloc_tmp);
+		mi_b->dirty = true;
+
+		if (next_svcn >= vcn && !to_allocate) {
+			/* Normal way. update attribute and exit */
+			attr_b->nres.data_size = cpu_to_le64(new_size);
+			goto ok;
+		}
+
+		/* at least two mft to avoid recursive loop*/
+		if (is_mft && next_svcn == vcn &&
+		    ((u64)done << sbi->cluster_bits) >= 2 * sbi->record_size) {
+			new_size = new_alloc_tmp;
+			attr_b->nres.data_size = attr_b->nres.alloc_size;
+			goto ok;
+		}
+
+		if (le32_to_cpu(rec->used) < sbi->record_size) {
+			old_alen = next_svcn;
+			evcn = old_alen - 1;
+			goto add_alloc_in_same_attr_seg;
+		}
+
+		attr_b->nres.data_size = attr_b->nres.alloc_size;
+		if (new_alloc_tmp < old_valid)
+			attr_b->nres.valid_size = attr_b->nres.data_size;
+
+		if (type == ATTR_LIST) {
+			err = ni_expand_list(ni);
+			if (err)
+				goto out;
+			if (next_svcn < vcn)
+				goto pack_runs;
+
+			/* layout of records is changed */
+			goto again;
+		}
+
+		if (!ni->attr_list.size) {
+			err = ni_create_attr_list(ni);
+			if (err)
+				goto out;
+			/* layout of records is changed */
+		}
+
+		if (next_svcn >= vcn) {
+			/* this is mft data, repeat */
+			goto again;
+		}
+
+		/* insert new attribute segment */
+		err = ni_insert_nonresident(ni, type, name, name_len, run,
+					    next_svcn, vcn - next_svcn,
+					    attr_b->flags, &attr, &mi);
+		if (err)
+			goto out;
+
+		if (!is_mft)
+			run_truncate_head(run, evcn + 1);
+
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn = le64_to_cpu(attr->nres.evcn);
+
+		le_b = NULL;
+		/* layout of records maybe changed */
+		/* find base attribute to update*/
+		attr_b = ni_find_attr(ni, NULL, &le_b, type, name, name_len,
+				      NULL, &mi_b);
+		if (!attr_b) {
+			err = -ENOENT;
+			goto out;
+		}
+
+		attr_b->nres.alloc_size = cpu_to_le64((u64)vcn << cluster_bits);
+		attr_b->nres.data_size = attr_b->nres.alloc_size;
+		attr_b->nres.valid_size = attr_b->nres.alloc_size;
+		mi_b->dirty = true;
+		goto again_1;
+	}
+
+	if (new_size != old_size ||
+	    (new_alloc != old_alloc && !keep_prealloc)) {
+		vcn = max(svcn, new_alen);
+		new_alloc_tmp = (u64)vcn << cluster_bits;
+
+		alen = 0;
+		err = run_deallocate_ex(sbi, run, vcn, evcn - vcn + 1, &alen,
+					true);
+		if (err)
+			goto out;
+
+		run_truncate(run, vcn);
+
+		if (vcn > svcn) {
+			err = mi_pack_runs(mi, attr, run, vcn - svcn);
+			if (err)
+				goto out;
+		} else if (le && le->vcn) {
+			u16 le_sz = le16_to_cpu(le->size);
+
+			/*
+			 * NOTE: list entries for one attribute are always
+			 * the same size. We deal with last entry (vcn==0)
+			 * and it is not first in entries array
+			 * (list entry for std attribute always first)
+			 * So it is safe to step back
+			 */
+			mi_remove_attr(mi, attr);
+
+			if (!al_remove_le(ni, le)) {
+				err = -EINVAL;
+				goto out;
+			}
+
+			le = (struct ATTR_LIST_ENTRY *)((u8 *)le - le_sz);
+		} else {
+			attr->nres.evcn = cpu_to_le64((u64)vcn - 1);
+			mi->dirty = true;
+		}
+
+		attr_b->nres.alloc_size = cpu_to_le64(new_alloc_tmp);
+
+		if (vcn == new_alen) {
+			attr_b->nres.data_size = cpu_to_le64(new_size);
+			if (new_size < old_valid)
+				attr_b->nres.valid_size =
+					attr_b->nres.data_size;
+		} else {
+			if (new_alloc_tmp <=
+			    le64_to_cpu(attr_b->nres.data_size))
+				attr_b->nres.data_size =
+					attr_b->nres.alloc_size;
+			if (new_alloc_tmp <
+			    le64_to_cpu(attr_b->nres.valid_size))
+				attr_b->nres.valid_size =
+					attr_b->nres.alloc_size;
+		}
+
+		if (is_ext)
+			le64_sub_cpu(&attr_b->nres.total_size,
+				     ((u64)alen << cluster_bits));
+
+		mi_b->dirty = true;
+
+		if (new_alloc_tmp <= new_alloc)
+			goto ok;
+
+		old_size = new_alloc_tmp;
+		vcn = svcn - 1;
+
+		if (le == le_b) {
+			attr = attr_b;
+			mi = mi_b;
+			evcn = svcn - 1;
+			svcn = 0;
+			goto next_le;
+		}
+
+		if (le->type != type || le->name_len != name_len ||
+		    memcmp(le_name(le), name, name_len * sizeof(short))) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		err = ni_load_mi(ni, le, &mi);
+		if (err)
+			goto out;
+
+		attr = mi_find_attr(mi, NULL, type, name, name_len, &le->id);
+		if (!attr) {
+			err = -EINVAL;
+			goto out;
+		}
+		goto next_le_1;
+	}
+
+ok:
+	if (new_valid) {
+		__le64 valid = cpu_to_le64(min(*new_valid, new_size));
+
+		if (attr_b->nres.valid_size != valid) {
+			attr_b->nres.valid_size = valid;
+			mi_b->dirty = true;
+		}
+	}
+
+out:
+	if (!err && attr_b && ret)
+		*ret = attr_b;
+
+	/* update inode_set_bytes*/
+	if (!err && ((type == ATTR_DATA && !name_len) ||
+		     (type == ATTR_ALLOC && name == I30_NAME))) {
+		bool dirty = false;
+
+		if (ni->vfs_inode.i_size != new_size) {
+			ni->vfs_inode.i_size = new_size;
+			dirty = true;
+		}
+
+		if (attr_b && attr_b->non_res) {
+			new_alloc = le64_to_cpu(attr_b->nres.alloc_size);
+			if (inode_get_bytes(&ni->vfs_inode) != new_alloc) {
+				inode_set_bytes(&ni->vfs_inode, new_alloc);
+				dirty = true;
+			}
+		}
+
+		if (dirty) {
+			ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+			mark_inode_dirty(&ni->vfs_inode);
+		}
+	}
+
+	return err;
+}
+
+int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn,
+			CLST *len, bool *new)
+{
+	int err = 0;
+	struct runs_tree *run = &ni->file.run;
+	struct ntfs_sb_info *sbi;
+	u8 cluster_bits;
+	struct ATTRIB *attr = NULL, *attr_b;
+	struct ATTR_LIST_ENTRY *le, *le_b;
+	struct mft_inode *mi, *mi_b;
+	CLST hint, svcn, to_alloc, evcn1, next_svcn, asize, end;
+	u64 new_size, total_size;
+	u32 clst_per_frame;
+	bool ok;
+
+	if (new)
+		*new = false;
+
+	down_read(&ni->file.run_lock);
+	ok = run_lookup_entry(run, vcn, lcn, len, NULL);
+	up_read(&ni->file.run_lock);
+
+	if (ok && (*lcn != SPARSE_LCN || !new)) {
+		/* normal way */
+		return 0;
+	}
+
+	if (!clen)
+		clen = 1;
+
+	if (ok && clen > *len)
+		clen = *len;
+
+	sbi = ni->mi.sbi;
+	cluster_bits = sbi->cluster_bits;
+	new_size = ((u64)vcn + clen) << cluster_bits;
+
+	ni_lock(ni);
+	down_write(&ni->file.run_lock);
+
+	le_b = NULL;
+	attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+	if (!attr_b) {
+		err = -ENOENT;
+		goto out;
+	}
+
+	if (!attr_b->non_res) {
+		*lcn = RESIDENT_LCN;
+		*len = 1;
+		goto out;
+	}
+
+	asize = le64_to_cpu(attr_b->nres.alloc_size) >> sbi->cluster_bits;
+	if (vcn >= asize) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	clst_per_frame = 1u << attr_b->nres.c_unit;
+	to_alloc = (clen + clst_per_frame - 1) & ~(clst_per_frame - 1);
+
+	if (vcn + to_alloc > asize)
+		to_alloc = asize - vcn;
+
+	svcn = le64_to_cpu(attr_b->nres.svcn);
+	evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+	attr = attr_b;
+	le = le_b;
+	mi = mi_b;
+
+	if (le_b && (vcn < svcn || evcn1 <= vcn)) {
+		attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+				    &mi);
+		if (!attr) {
+			err = -EINVAL;
+			goto out;
+		}
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+	}
+
+	err = attr_load_runs(attr, ni, run, NULL);
+	if (err)
+		goto out;
+
+	if (!ok) {
+		ok = run_lookup_entry(run, vcn, lcn, len, NULL);
+		if (ok && (*lcn != SPARSE_LCN || !new)) {
+			/* normal way */
+			err = 0;
+			goto ok;
+		}
+
+		if (!ok && !new) {
+			*len = 0;
+			err = 0;
+			goto ok;
+		}
+
+		if (ok && clen > *len) {
+			clen = *len;
+			new_size = ((u64)vcn + clen) << cluster_bits;
+			to_alloc = (clen + clst_per_frame - 1) &
+				   ~(clst_per_frame - 1);
+		}
+	}
+
+	if (!is_attr_ext(attr_b)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Get the last lcn to allocate from */
+	hint = 0;
+
+	if (vcn > evcn1) {
+		if (!run_add_entry(run, evcn1, SPARSE_LCN, vcn - evcn1,
+				   false)) {
+			err = -ENOMEM;
+			goto out;
+		}
+	} else if (vcn && !run_lookup_entry(run, vcn - 1, &hint, NULL, NULL)) {
+		hint = -1;
+	}
+
+	err = attr_allocate_clusters(
+		sbi, run, vcn, hint + 1, to_alloc, NULL, 0, len,
+		(sbi->record_size - le32_to_cpu(mi->mrec->used) + 8) / 3 + 1,
+		lcn);
+	if (err)
+		goto out;
+	*new = true;
+
+	end = vcn + *len;
+
+	total_size = le64_to_cpu(attr_b->nres.total_size) +
+		     ((u64)*len << cluster_bits);
+
+repack:
+	err = mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn);
+	if (err)
+		goto out;
+
+	attr_b->nres.total_size = cpu_to_le64(total_size);
+	inode_set_bytes(&ni->vfs_inode, total_size);
+	ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+
+	mi_b->dirty = true;
+	mark_inode_dirty(&ni->vfs_inode);
+
+	/* stored [vcn : next_svcn) from [vcn : end) */
+	next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+
+	if (end <= evcn1) {
+		if (next_svcn == evcn1) {
+			/* Normal way. update attribute and exit */
+			goto ok;
+		}
+		/* add new segment [next_svcn : evcn1 - next_svcn )*/
+		if (!ni->attr_list.size) {
+			err = ni_create_attr_list(ni);
+			if (err)
+				goto out;
+			/* layout of records is changed */
+			le_b = NULL;
+			attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL,
+					      0, NULL, &mi_b);
+			if (!attr_b) {
+				err = -ENOENT;
+				goto out;
+			}
+
+			attr = attr_b;
+			le = le_b;
+			mi = mi_b;
+			goto repack;
+		}
+	}
+
+	svcn = evcn1;
+
+	/* Estimate next attribute */
+	attr = ni_find_attr(ni, attr, &le, ATTR_DATA, NULL, 0, &svcn, &mi);
+
+	if (attr) {
+		CLST alloc = bytes_to_cluster(
+			sbi, le64_to_cpu(attr_b->nres.alloc_size));
+		CLST evcn = le64_to_cpu(attr->nres.evcn);
+
+		if (end < next_svcn)
+			end = next_svcn;
+		while (end > evcn) {
+			/* remove segment [svcn : evcn)*/
+			mi_remove_attr(mi, attr);
+
+			if (!al_remove_le(ni, le)) {
+				err = -EINVAL;
+				goto out;
+			}
+
+			if (evcn + 1 >= alloc) {
+				/* last attribute segment */
+				evcn1 = evcn + 1;
+				goto ins_ext;
+			}
+
+			if (ni_load_mi(ni, le, &mi)) {
+				attr = NULL;
+				goto out;
+			}
+
+			attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL, 0,
+					    &le->id);
+			if (!attr) {
+				err = -EINVAL;
+				goto out;
+			}
+			svcn = le64_to_cpu(attr->nres.svcn);
+			evcn = le64_to_cpu(attr->nres.evcn);
+		}
+
+		if (end < svcn)
+			end = svcn;
+
+		err = attr_load_runs(attr, ni, run, &end);
+		if (err)
+			goto out;
+
+		evcn1 = evcn + 1;
+		attr->nres.svcn = cpu_to_le64(next_svcn);
+		err = mi_pack_runs(mi, attr, run, evcn1 - next_svcn);
+		if (err)
+			goto out;
+
+		le->vcn = cpu_to_le64(next_svcn);
+		ni->attr_list.dirty = true;
+		mi->dirty = true;
+
+		next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+	}
+ins_ext:
+	if (evcn1 > next_svcn) {
+		err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run,
+					    next_svcn, evcn1 - next_svcn,
+					    attr_b->flags, &attr, &mi);
+		if (err)
+			goto out;
+	}
+ok:
+	run_truncate_around(run, vcn);
+out:
+	up_write(&ni->file.run_lock);
+	ni_unlock(ni);
+
+	return err;
+}
+
+int attr_data_read_resident(struct ntfs_inode *ni, struct page *page)
+{
+	u64 vbo;
+	struct ATTRIB *attr;
+	u32 data_size;
+
+	attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, NULL);
+	if (!attr)
+		return -EINVAL;
+
+	if (attr->non_res)
+		return E_NTFS_NONRESIDENT;
+
+	vbo = page->index << PAGE_SHIFT;
+	data_size = le32_to_cpu(attr->res.data_size);
+	if (vbo < data_size) {
+		const char *data = resident_data(attr);
+		char *kaddr = kmap_atomic(page);
+		u32 use = data_size - vbo;
+
+		if (use > PAGE_SIZE)
+			use = PAGE_SIZE;
+
+		memcpy(kaddr, data + vbo, use);
+		memset(kaddr + use, 0, PAGE_SIZE - use);
+		kunmap_atomic(kaddr);
+		flush_dcache_page(page);
+		SetPageUptodate(page);
+	} else if (!PageUptodate(page)) {
+		zero_user_segment(page, 0, PAGE_SIZE);
+		SetPageUptodate(page);
+	}
+
+	return 0;
+}
+
+int attr_data_write_resident(struct ntfs_inode *ni, struct page *page)
+{
+	u64 vbo;
+	struct mft_inode *mi;
+	struct ATTRIB *attr;
+	u32 data_size;
+
+	attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, &mi);
+	if (!attr)
+		return -EINVAL;
+
+	if (attr->non_res) {
+		/*return special error code to check this case*/
+		return E_NTFS_NONRESIDENT;
+	}
+
+	vbo = page->index << PAGE_SHIFT;
+	data_size = le32_to_cpu(attr->res.data_size);
+	if (vbo < data_size) {
+		char *data = resident_data(attr);
+		char *kaddr = kmap_atomic(page);
+		u32 use = data_size - vbo;
+
+		if (use > PAGE_SIZE)
+			use = PAGE_SIZE;
+		memcpy(data + vbo, kaddr, use);
+		kunmap_atomic(kaddr);
+		mi->dirty = true;
+	}
+	ni->i_valid = data_size;
+
+	return 0;
+}
+
+/*
+ * attr_load_runs_vcn
+ *
+ * load runs with vcn
+ */
+int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type,
+		       const __le16 *name, u8 name_len, struct runs_tree *run,
+		       CLST vcn)
+{
+	struct ATTRIB *attr;
+	int err;
+	CLST svcn, evcn;
+	u16 ro;
+
+	attr = ni_find_attr(ni, NULL, NULL, type, name, name_len, &vcn, NULL);
+	if (!attr)
+		return -ENOENT;
+
+	svcn = le64_to_cpu(attr->nres.svcn);
+	evcn = le64_to_cpu(attr->nres.evcn);
+
+	if (evcn < vcn || vcn < svcn)
+		return -EINVAL;
+
+	ro = le16_to_cpu(attr->nres.run_off);
+	err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn, svcn,
+			    Add2Ptr(attr, ro), le32_to_cpu(attr->size) - ro);
+	if (err < 0)
+		return err;
+	return 0;
+}
+
+/*
+ * load runs for given range [from to)
+ */
+int attr_load_runs_range(struct ntfs_inode *ni, enum ATTR_TYPE type,
+			 const __le16 *name, u8 name_len, struct runs_tree *run,
+			 u64 from, u64 to)
+{
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	u8 cluster_bits = sbi->cluster_bits;
+	CLST vcn = from >> cluster_bits;
+	CLST vcn_last = (to - 1) >> cluster_bits;
+	CLST lcn, clen;
+	int err;
+
+	for (vcn = from >> cluster_bits; vcn <= vcn_last; vcn += clen) {
+		if (!run_lookup_entry(run, vcn, &lcn, &clen, NULL)) {
+			err = attr_load_runs_vcn(ni, type, name, name_len, run,
+						 vcn);
+			if (err)
+				return err;
+			clen = 0; /*next run_lookup_entry(vcn) must be success*/
+		}
+	}
+
+	return 0;
+}
+
+#ifdef CONFIG_NTFS3_LZX_XPRESS
+/*
+ * attr_wof_frame_info
+ *
+ * read header of xpress/lzx file to get info about frame
+ */
+int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr,
+			struct runs_tree *run, u64 frame, u64 frames,
+			u8 frame_bits, u32 *ondisk_size, u64 *vbo_data)
+{
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	u64 vbo[2], off[2], wof_size;
+	u32 voff;
+	u8 bytes_per_off;
+	char *addr;
+	struct page *page;
+	int i, err;
+	__le32 *off32;
+	__le64 *off64;
+
+	if (ni->vfs_inode.i_size < 0x100000000ull) {
+		/* file starts with array of 32 bit offsets */
+		bytes_per_off = sizeof(__le32);
+		vbo[1] = frame << 2;
+		*vbo_data = frames << 2;
+	} else {
+		/* file starts with array of 64 bit offsets */
+		bytes_per_off = sizeof(__le64);
+		vbo[1] = frame << 3;
+		*vbo_data = frames << 3;
+	}
+
+	/*
+	 * read 4/8 bytes at [vbo - 4(8)] == offset where compressed frame starts
+	 * read 4/8 bytes at [vbo] == offset where compressed frame ends
+	 */
+	if (!attr->non_res) {
+		if (vbo[1] + bytes_per_off > le32_to_cpu(attr->res.data_size)) {
+			ntfs_inode_err(&ni->vfs_inode, "is corrupted");
+			return -EINVAL;
+		}
+		addr = resident_data(attr);
+
+		if (bytes_per_off == sizeof(__le32)) {
+			off32 = Add2Ptr(addr, vbo[1]);
+			off[0] = vbo[1] ? le32_to_cpu(off32[-1]) : 0;
+			off[1] = le32_to_cpu(off32[0]);
+		} else {
+			off64 = Add2Ptr(addr, vbo[1]);
+			off[0] = vbo[1] ? le64_to_cpu(off64[-1]) : 0;
+			off[1] = le64_to_cpu(off64[0]);
+		}
+
+		*vbo_data += off[0];
+		*ondisk_size = off[1] - off[0];
+		return 0;
+	}
+
+	wof_size = le64_to_cpu(attr->nres.data_size);
+	down_write(&ni->file.run_lock);
+	page = ni->file.offs_page;
+	if (!page) {
+		page = alloc_page(GFP_KERNEL);
+		if (!page) {
+			err = -ENOMEM;
+			goto out;
+		}
+		page->index = -1;
+		ni->file.offs_page = page;
+	}
+	lock_page(page);
+	addr = page_address(page);
+
+	if (vbo[1]) {
+		voff = vbo[1] & (PAGE_SIZE - 1);
+		vbo[0] = vbo[1] - bytes_per_off;
+		i = 0;
+	} else {
+		voff = 0;
+		vbo[0] = 0;
+		off[0] = 0;
+		i = 1;
+	}
+
+	do {
+		pgoff_t index = vbo[i] >> PAGE_SHIFT;
+
+		if (index != page->index) {
+			u64 from = vbo[i] & ~(u64)(PAGE_SIZE - 1);
+			u64 to = min(from + PAGE_SIZE, wof_size);
+
+			err = attr_load_runs_range(ni, ATTR_DATA, WOF_NAME,
+						   ARRAY_SIZE(WOF_NAME), run,
+						   from, to);
+			if (err)
+				goto out1;
+
+			err = ntfs_bio_pages(sbi, run, &page, 1, from,
+					     to - from, REQ_OP_READ);
+			if (err) {
+				page->index = -1;
+				goto out1;
+			}
+			page->index = index;
+		}
+
+		if (i) {
+			if (bytes_per_off == sizeof(__le32)) {
+				off32 = Add2Ptr(addr, voff);
+				off[1] = le32_to_cpu(*off32);
+			} else {
+				off64 = Add2Ptr(addr, voff);
+				off[1] = le64_to_cpu(*off64);
+			}
+		} else if (!voff) {
+			if (bytes_per_off == sizeof(__le32)) {
+				off32 = Add2Ptr(addr, PAGE_SIZE - sizeof(u32));
+				off[0] = le32_to_cpu(*off32);
+			} else {
+				off64 = Add2Ptr(addr, PAGE_SIZE - sizeof(u64));
+				off[0] = le64_to_cpu(*off64);
+			}
+		} else {
+			/* two values in one page*/
+			if (bytes_per_off == sizeof(__le32)) {
+				off32 = Add2Ptr(addr, voff);
+				off[0] = le32_to_cpu(off32[-1]);
+				off[1] = le32_to_cpu(off32[0]);
+			} else {
+				off64 = Add2Ptr(addr, voff);
+				off[0] = le64_to_cpu(off64[-1]);
+				off[1] = le64_to_cpu(off64[0]);
+			}
+			break;
+		}
+	} while (++i < 2);
+
+	*vbo_data += off[0];
+	*ondisk_size = off[1] - off[0];
+
+out1:
+	unlock_page(page);
+out:
+	up_write(&ni->file.run_lock);
+	return err;
+}
+#endif
+
+/*
+ * attr_is_frame_compressed
+ *
+ * This function is used to detect compressed frame
+ */
+int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr,
+			     CLST frame, CLST *clst_data)
+{
+	int err;
+	u32 clst_frame;
+	CLST clen, lcn, vcn, alen, slen, vcn_next;
+	size_t idx;
+	struct runs_tree *run;
+
+	*clst_data = 0;
+
+	if (!is_attr_compressed(attr))
+		return 0;
+
+	if (!attr->non_res)
+		return 0;
+
+	clst_frame = 1u << attr->nres.c_unit;
+	vcn = frame * clst_frame;
+	run = &ni->file.run;
+
+	if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) {
+		err = attr_load_runs_vcn(ni, attr->type, attr_name(attr),
+					 attr->name_len, run, vcn);
+		if (err)
+			return err;
+
+		if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+			return -EINVAL;
+	}
+
+	if (lcn == SPARSE_LCN) {
+		/* sparsed frame */
+		return 0;
+	}
+
+	if (clen >= clst_frame) {
+		/*
+		 * The frame is not compressed 'cause
+		 * it does not contain any sparse clusters
+		 */
+		*clst_data = clst_frame;
+		return 0;
+	}
+
+	alen = bytes_to_cluster(ni->mi.sbi, le64_to_cpu(attr->nres.alloc_size));
+	slen = 0;
+	*clst_data = clen;
+
+	/*
+	 * The frame is compressed if *clst_data + slen >= clst_frame
+	 * Check next fragments
+	 */
+	while ((vcn += clen) < alen) {
+		vcn_next = vcn;
+
+		if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) ||
+		    vcn_next != vcn) {
+			err = attr_load_runs_vcn(ni, attr->type,
+						 attr_name(attr),
+						 attr->name_len, run, vcn_next);
+			if (err)
+				return err;
+			vcn = vcn_next;
+
+			if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx))
+				return -EINVAL;
+		}
+
+		if (lcn == SPARSE_LCN) {
+			slen += clen;
+		} else {
+			if (slen) {
+				/*
+				 * data_clusters + sparse_clusters =
+				 * not enough for frame
+				 */
+				return -EINVAL;
+			}
+			*clst_data += clen;
+		}
+
+		if (*clst_data + slen >= clst_frame) {
+			if (!slen) {
+				/*
+				 * There is no sparsed clusters in this frame
+				 * So it is not compressed
+				 */
+				*clst_data = clst_frame;
+			} else {
+				/*frame is compressed*/
+			}
+			break;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * attr_allocate_frame
+ *
+ * allocate/free clusters for 'frame'
+ * assumed: down_write(&ni->file.run_lock);
+ */
+int attr_allocate_frame(struct ntfs_inode *ni, CLST frame, size_t compr_size,
+			u64 new_valid)
+{
+	int err = 0;
+	struct runs_tree *run = &ni->file.run;
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	struct ATTRIB *attr = NULL, *attr_b;
+	struct ATTR_LIST_ENTRY *le, *le_b;
+	struct mft_inode *mi, *mi_b;
+	CLST svcn, evcn1, next_svcn, lcn, len;
+	CLST vcn, end, clst_data;
+	u64 total_size, valid_size, data_size;
+
+	le_b = NULL;
+	attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+	if (!attr_b)
+		return -ENOENT;
+
+	if (!is_attr_ext(attr_b))
+		return -EINVAL;
+
+	vcn = frame << NTFS_LZNT_CUNIT;
+	total_size = le64_to_cpu(attr_b->nres.total_size);
+
+	svcn = le64_to_cpu(attr_b->nres.svcn);
+	evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+	data_size = le64_to_cpu(attr_b->nres.data_size);
+
+	if (svcn <= vcn && vcn < evcn1) {
+		attr = attr_b;
+		le = le_b;
+		mi = mi_b;
+	} else if (!le_b) {
+		err = -EINVAL;
+		goto out;
+	} else {
+		le = le_b;
+		attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+				    &mi);
+		if (!attr) {
+			err = -EINVAL;
+			goto out;
+		}
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+	}
+
+	err = attr_load_runs(attr, ni, run, NULL);
+	if (err)
+		goto out;
+
+	err = attr_is_frame_compressed(ni, attr_b, frame, &clst_data);
+	if (err)
+		goto out;
+
+	total_size -= (u64)clst_data << sbi->cluster_bits;
+
+	len = bytes_to_cluster(sbi, compr_size);
+
+	if (len == clst_data)
+		goto out;
+
+	if (len < clst_data) {
+		err = run_deallocate_ex(sbi, run, vcn + len, clst_data - len,
+					NULL, true);
+		if (err)
+			goto out;
+
+		if (!run_add_entry(run, vcn + len, SPARSE_LCN, clst_data - len,
+				   false)) {
+			err = -ENOMEM;
+			goto out;
+		}
+		end = vcn + clst_data;
+		/* run contains updated range [vcn + len : end) */
+	} else {
+		CLST alen, hint = 0;
+		/* Get the last lcn to allocate from */
+		if (vcn + clst_data &&
+		    !run_lookup_entry(run, vcn + clst_data - 1, &hint, NULL,
+				      NULL)) {
+			hint = -1;
+		}
+
+		err = attr_allocate_clusters(sbi, run, vcn + clst_data,
+					     hint + 1, len - clst_data, NULL, 0,
+					     &alen, 0, &lcn);
+		if (err)
+			goto out;
+
+		end = vcn + len;
+		/* run contains updated range [vcn + clst_data : end) */
+	}
+
+	total_size += (u64)len << sbi->cluster_bits;
+
+repack:
+	err = mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn);
+	if (err)
+		goto out;
+
+	attr_b->nres.total_size = cpu_to_le64(total_size);
+	inode_set_bytes(&ni->vfs_inode, total_size);
+
+	mi_b->dirty = true;
+	mark_inode_dirty(&ni->vfs_inode);
+
+	/* stored [vcn : next_svcn) from [vcn : end) */
+	next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+
+	if (end <= evcn1) {
+		if (next_svcn == evcn1) {
+			/* Normal way. update attribute and exit */
+			goto ok;
+		}
+		/* add new segment [next_svcn : evcn1 - next_svcn )*/
+		if (!ni->attr_list.size) {
+			err = ni_create_attr_list(ni);
+			if (err)
+				goto out;
+			/* layout of records is changed */
+			le_b = NULL;
+			attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL,
+					      0, NULL, &mi_b);
+			if (!attr_b) {
+				err = -ENOENT;
+				goto out;
+			}
+
+			attr = attr_b;
+			le = le_b;
+			mi = mi_b;
+			goto repack;
+		}
+	}
+
+	svcn = evcn1;
+
+	/* Estimate next attribute */
+	attr = ni_find_attr(ni, attr, &le, ATTR_DATA, NULL, 0, &svcn, &mi);
+
+	if (attr) {
+		CLST alloc = bytes_to_cluster(
+			sbi, le64_to_cpu(attr_b->nres.alloc_size));
+		CLST evcn = le64_to_cpu(attr->nres.evcn);
+
+		if (end < next_svcn)
+			end = next_svcn;
+		while (end > evcn) {
+			/* remove segment [svcn : evcn)*/
+			mi_remove_attr(mi, attr);
+
+			if (!al_remove_le(ni, le)) {
+				err = -EINVAL;
+				goto out;
+			}
+
+			if (evcn + 1 >= alloc) {
+				/* last attribute segment */
+				evcn1 = evcn + 1;
+				goto ins_ext;
+			}
+
+			if (ni_load_mi(ni, le, &mi)) {
+				attr = NULL;
+				goto out;
+			}
+
+			attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL, 0,
+					    &le->id);
+			if (!attr) {
+				err = -EINVAL;
+				goto out;
+			}
+			svcn = le64_to_cpu(attr->nres.svcn);
+			evcn = le64_to_cpu(attr->nres.evcn);
+		}
+
+		if (end < svcn)
+			end = svcn;
+
+		err = attr_load_runs(attr, ni, run, &end);
+		if (err)
+			goto out;
+
+		evcn1 = evcn + 1;
+		attr->nres.svcn = cpu_to_le64(next_svcn);
+		err = mi_pack_runs(mi, attr, run, evcn1 - next_svcn);
+		if (err)
+			goto out;
+
+		le->vcn = cpu_to_le64(next_svcn);
+		ni->attr_list.dirty = true;
+		mi->dirty = true;
+
+		next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+	}
+ins_ext:
+	if (evcn1 > next_svcn) {
+		err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run,
+					    next_svcn, evcn1 - next_svcn,
+					    attr_b->flags, &attr, &mi);
+		if (err)
+			goto out;
+	}
+ok:
+	run_truncate_around(run, vcn);
+out:
+	if (new_valid > data_size)
+		new_valid = data_size;
+
+	valid_size = le64_to_cpu(attr_b->nres.valid_size);
+	if (new_valid != valid_size) {
+		attr_b->nres.valid_size = cpu_to_le64(valid_size);
+		mi_b->dirty = true;
+	}
+
+	return err;
+}
+
+/* Collapse range in file */
+int attr_collapse_range(struct ntfs_inode *ni, u64 vbo, u64 bytes)
+{
+	int err = 0;
+	struct runs_tree *run = &ni->file.run;
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	struct ATTRIB *attr = NULL, *attr_b;
+	struct ATTR_LIST_ENTRY *le, *le_b;
+	struct mft_inode *mi, *mi_b;
+	CLST svcn, evcn1, len, dealloc, alen;
+	CLST vcn, end;
+	u64 valid_size, data_size, alloc_size, total_size;
+	u32 mask;
+	__le16 a_flags;
+
+	if (!bytes)
+		return 0;
+
+	le_b = NULL;
+	attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+	if (!attr_b)
+		return -ENOENT;
+
+	if (!attr_b->non_res) {
+		/* Attribute is resident. Nothing to do? */
+		return 0;
+	}
+
+	data_size = le64_to_cpu(attr_b->nres.data_size);
+	alloc_size = le64_to_cpu(attr_b->nres.alloc_size);
+	a_flags = attr_b->flags;
+
+	if (is_attr_ext(attr_b)) {
+		total_size = le64_to_cpu(attr_b->nres.total_size);
+		mask = (1u << (attr_b->nres.c_unit + sbi->cluster_bits)) - 1;
+	} else {
+		total_size = alloc_size;
+		mask = sbi->cluster_mask;
+	}
+
+	if (vbo & mask)
+		return -EINVAL;
+
+	if (bytes & mask)
+		return -EINVAL;
+
+	if (vbo > data_size)
+		return -EINVAL;
+
+	down_write(&ni->file.run_lock);
+
+	if (vbo + bytes >= data_size) {
+		u64 new_valid = min(ni->i_valid, vbo);
+
+		/* Simple truncate file at 'vbo' */
+		truncate_setsize(&ni->vfs_inode, vbo);
+		err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, vbo,
+				    &new_valid, true, NULL);
+
+		if (!err && new_valid < ni->i_valid)
+			ni->i_valid = new_valid;
+
+		goto out;
+	}
+
+	/*
+	 * Enumerate all attribute segments and collapse
+	 */
+	alen = alloc_size >> sbi->cluster_bits;
+	vcn = vbo >> sbi->cluster_bits;
+	len = bytes >> sbi->cluster_bits;
+	end = vcn + len;
+	dealloc = 0;
+
+	svcn = le64_to_cpu(attr_b->nres.svcn);
+	evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+	if (svcn <= vcn && vcn < evcn1) {
+		attr = attr_b;
+		le = le_b;
+		mi = mi_b;
+	} else if (!le_b) {
+		err = -EINVAL;
+		goto out;
+	} else {
+		le = le_b;
+		attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+				    &mi);
+		if (!attr) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+	}
+
+	for (;;) {
+		if (svcn >= end) {
+			/* shift vcn */
+			attr->nres.svcn = cpu_to_le64(svcn - len);
+			attr->nres.evcn = cpu_to_le64(evcn1 - 1 - len);
+			if (le) {
+				le->vcn = attr->nres.svcn;
+				ni->attr_list.dirty = true;
+			}
+			mi->dirty = true;
+		} else if (svcn < vcn || end < evcn1) {
+			CLST vcn1, eat, next_svcn;
+
+			/* collapse a part of this attribute segment */
+			err = attr_load_runs(attr, ni, run, &svcn);
+			if (err)
+				goto out;
+			vcn1 = max(vcn, svcn);
+			eat = min(end, evcn1) - vcn1;
+
+			err = run_deallocate_ex(sbi, run, vcn1, eat, &dealloc,
+						true);
+			if (err)
+				goto out;
+
+			if (!run_collapse_range(run, vcn1, eat)) {
+				err = -ENOMEM;
+				goto out;
+			}
+
+			if (svcn >= vcn) {
+				/* shift vcn */
+				attr->nres.svcn = cpu_to_le64(vcn);
+				if (le) {
+					le->vcn = attr->nres.svcn;
+					ni->attr_list.dirty = true;
+				}
+			}
+
+			err = mi_pack_runs(mi, attr, run, evcn1 - svcn - eat);
+			if (err)
+				goto out;
+
+			next_svcn = le64_to_cpu(attr->nres.evcn) + 1;
+			if (next_svcn + eat < evcn1) {
+				err = ni_insert_nonresident(
+					ni, ATTR_DATA, NULL, 0, run, next_svcn,
+					evcn1 - eat - next_svcn, a_flags, &attr,
+					&mi);
+				if (err)
+					goto out;
+
+				/* layout of records maybe changed */
+				attr_b = NULL;
+				le = al_find_ex(ni, NULL, ATTR_DATA, NULL, 0,
+						&next_svcn);
+				if (!le) {
+					err = -EINVAL;
+					goto out;
+				}
+			}
+
+			/* free all allocated memory */
+			run_truncate(run, 0);
+		} else {
+			u16 le_sz;
+			u16 roff = le16_to_cpu(attr->nres.run_off);
+
+			/*run==1 means unpack and deallocate*/
+			run_unpack_ex(RUN_DEALLOCATE, sbi, ni->mi.rno, svcn,
+				      evcn1 - 1, svcn, Add2Ptr(attr, roff),
+				      le32_to_cpu(attr->size) - roff);
+
+			/* delete this attribute segment */
+			mi_remove_attr(mi, attr);
+			if (!le)
+				break;
+
+			le_sz = le16_to_cpu(le->size);
+			if (!al_remove_le(ni, le)) {
+				err = -EINVAL;
+				goto out;
+			}
+
+			if (evcn1 >= alen)
+				break;
+
+			if (!svcn) {
+				/* Load next record that contains this attribute */
+				if (ni_load_mi(ni, le, &mi)) {
+					err = -EINVAL;
+					goto out;
+				}
+
+				/* Look for required attribute */
+				attr = mi_find_attr(mi, NULL, ATTR_DATA, NULL,
+						    0, &le->id);
+				if (!attr) {
+					err = -EINVAL;
+					goto out;
+				}
+				goto next_attr;
+			}
+			le = (struct ATTR_LIST_ENTRY *)((u8 *)le - le_sz);
+		}
+
+		if (evcn1 >= alen)
+			break;
+
+		attr = ni_enum_attr_ex(ni, attr, &le, &mi);
+		if (!attr) {
+			err = -EINVAL;
+			goto out;
+		}
+
+next_attr:
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+	}
+
+	if (!attr_b) {
+		le_b = NULL;
+		attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL,
+				      &mi_b);
+		if (!attr_b) {
+			err = -ENOENT;
+			goto out;
+		}
+	}
+
+	data_size -= bytes;
+	valid_size = ni->i_valid;
+	if (vbo + bytes <= valid_size)
+		valid_size -= bytes;
+	else if (vbo < valid_size)
+		valid_size = vbo;
+
+	attr_b->nres.alloc_size = cpu_to_le64(alloc_size - bytes);
+	attr_b->nres.data_size = cpu_to_le64(data_size);
+	attr_b->nres.valid_size = cpu_to_le64(min(valid_size, data_size));
+	total_size -= (u64)dealloc << sbi->cluster_bits;
+	if (is_attr_ext(attr_b))
+		attr_b->nres.total_size = cpu_to_le64(total_size);
+	mi_b->dirty = true;
+
+	/*update inode size*/
+	ni->i_valid = valid_size;
+	ni->vfs_inode.i_size = data_size;
+	inode_set_bytes(&ni->vfs_inode, total_size);
+	ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+	mark_inode_dirty(&ni->vfs_inode);
+
+out:
+	up_write(&ni->file.run_lock);
+	if (err)
+		make_bad_inode(&ni->vfs_inode);
+
+	return err;
+}
+
+/* not for normal files */
+int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u64 bytes)
+{
+	int err = 0;
+	struct runs_tree *run = &ni->file.run;
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	struct ATTRIB *attr = NULL, *attr_b;
+	struct ATTR_LIST_ENTRY *le, *le_b;
+	struct mft_inode *mi, *mi_b;
+	CLST svcn, evcn1, vcn, len, end, alen, dealloc;
+	u64 total_size, alloc_size;
+
+	if (!bytes)
+		return 0;
+
+	le_b = NULL;
+	attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b);
+	if (!attr_b)
+		return -ENOENT;
+
+	if (!attr_b->non_res) {
+		u32 data_size = le32_to_cpu(attr->res.data_size);
+		u32 from, to;
+
+		if (vbo > data_size)
+			return 0;
+
+		from = vbo;
+		to = (vbo + bytes) < data_size ? (vbo + bytes) : data_size;
+		memset(Add2Ptr(resident_data(attr_b), from), 0, to - from);
+		return 0;
+	}
+
+	/* TODO: add support for normal files too */
+	if (!is_attr_ext(attr_b))
+		return -EOPNOTSUPP;
+
+	alloc_size = le64_to_cpu(attr_b->nres.alloc_size);
+	total_size = le64_to_cpu(attr_b->nres.total_size);
+
+	if (vbo >= alloc_size) {
+		// NOTE: it is allowed
+		return 0;
+	}
+
+	if (vbo + bytes > alloc_size)
+		bytes = alloc_size - vbo;
+
+	down_write(&ni->file.run_lock);
+	/*
+	 * Enumerate all attribute segments and punch hole where necessary
+	 */
+	alen = alloc_size >> sbi->cluster_bits;
+	vcn = vbo >> sbi->cluster_bits;
+	len = bytes >> sbi->cluster_bits;
+	end = vcn + len;
+	dealloc = 0;
+
+	svcn = le64_to_cpu(attr_b->nres.svcn);
+	evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1;
+
+	if (svcn <= vcn && vcn < evcn1) {
+		attr = attr_b;
+		le = le_b;
+		mi = mi_b;
+	} else if (!le_b) {
+		err = -EINVAL;
+		goto out;
+	} else {
+		le = le_b;
+		attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn,
+				    &mi);
+		if (!attr) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+	}
+
+	while (svcn < end) {
+		CLST vcn1, zero, dealloc2;
+
+		err = attr_load_runs(attr, ni, run, &svcn);
+		if (err)
+			goto out;
+		vcn1 = max(vcn, svcn);
+		zero = min(end, evcn1) - vcn1;
+
+		dealloc2 = dealloc;
+		err = run_deallocate_ex(sbi, run, vcn1, zero, &dealloc, true);
+		if (err)
+			goto out;
+
+		if (dealloc2 == dealloc) {
+			/* looks like  the required range is already sparsed */
+		} else {
+			if (!run_add_entry(run, vcn1, SPARSE_LCN, zero,
+					   false)) {
+				err = -ENOMEM;
+				goto out;
+			}
+
+			err = mi_pack_runs(mi, attr, run, evcn1 - svcn);
+			if (err)
+				goto out;
+		}
+		/* free all allocated memory */
+		run_truncate(run, 0);
+
+		if (evcn1 >= alen)
+			break;
+
+		attr = ni_enum_attr_ex(ni, attr, &le, &mi);
+		if (!attr) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn1 = le64_to_cpu(attr->nres.evcn) + 1;
+	}
+
+	total_size -= (u64)dealloc << sbi->cluster_bits;
+	attr_b->nres.total_size = cpu_to_le64(total_size);
+	mi_b->dirty = true;
+
+	/*update inode size*/
+	inode_set_bytes(&ni->vfs_inode, total_size);
+	ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+	mark_inode_dirty(&ni->vfs_inode);
+
+out:
+	up_write(&ni->file.run_lock);
+	if (err)
+		make_bad_inode(&ni->vfs_inode);
+
+	return err;
+}
diff --git a/fs/ntfs3/attrlist.c b/fs/ntfs3/attrlist.c
new file mode 100644
index 000000000000..4a97e259d377
--- /dev/null
+++ b/fs/ntfs3/attrlist.c
@@ -0,0 +1,457 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/* Returns true if le is valid */
+static inline bool al_is_valid_le(const struct ntfs_inode *ni,
+				  struct ATTR_LIST_ENTRY *le)
+{
+	if (!le || !ni->attr_list.le || !ni->attr_list.size)
+		return false;
+
+	return PtrOffset(ni->attr_list.le, le) + le16_to_cpu(le->size) <=
+	       ni->attr_list.size;
+}
+
+void al_destroy(struct ntfs_inode *ni)
+{
+	run_close(&ni->attr_list.run);
+	ntfs_free(ni->attr_list.le);
+	ni->attr_list.le = NULL;
+	ni->attr_list.size = 0;
+	ni->attr_list.dirty = false;
+}
+
+/*
+ * ntfs_load_attr_list
+ *
+ * This method makes sure that the ATTRIB list, if present,
+ * has been properly set up.
+ */
+int ntfs_load_attr_list(struct ntfs_inode *ni, struct ATTRIB *attr)
+{
+	int err;
+	size_t lsize;
+	void *le = NULL;
+
+	if (ni->attr_list.size)
+		return 0;
+
+	if (!attr->non_res) {
+		lsize = le32_to_cpu(attr->res.data_size);
+		le = ntfs_malloc(al_aligned(lsize));
+		if (!le) {
+			err = -ENOMEM;
+			goto out;
+		}
+		memcpy(le, resident_data(attr), lsize);
+	} else if (attr->nres.svcn) {
+		err = -EINVAL;
+		goto out;
+	} else {
+		u16 run_off = le16_to_cpu(attr->nres.run_off);
+
+		lsize = le64_to_cpu(attr->nres.data_size);
+
+		run_init(&ni->attr_list.run);
+
+		err = run_unpack_ex(&ni->attr_list.run, ni->mi.sbi, ni->mi.rno,
+				    0, le64_to_cpu(attr->nres.evcn), 0,
+				    Add2Ptr(attr, run_off),
+				    le32_to_cpu(attr->size) - run_off);
+		if (err < 0)
+			goto out;
+
+		le = ntfs_malloc(al_aligned(lsize));
+		if (!le) {
+			err = -ENOMEM;
+			goto out;
+		}
+
+		err = ntfs_read_run_nb(ni->mi.sbi, &ni->attr_list.run, 0, le,
+				       lsize, NULL);
+		if (err)
+			goto out;
+	}
+
+	ni->attr_list.size = lsize;
+	ni->attr_list.le = le;
+
+	return 0;
+
+out:
+	ni->attr_list.le = le;
+	al_destroy(ni);
+
+	return err;
+}
+
+/*
+ * al_enumerate
+ *
+ * Returns the next list le
+ * if le is NULL then returns the first le
+ */
+struct ATTR_LIST_ENTRY *al_enumerate(struct ntfs_inode *ni,
+				     struct ATTR_LIST_ENTRY *le)
+{
+	size_t off;
+	u16 sz;
+
+	if (!le) {
+		le = ni->attr_list.le;
+	} else {
+		sz = le16_to_cpu(le->size);
+		if (sz < sizeof(struct ATTR_LIST_ENTRY)) {
+			/* Impossible 'cause we should not return such le */
+			return NULL;
+		}
+		le = Add2Ptr(le, sz);
+	}
+
+	/* Check boundary */
+	off = PtrOffset(ni->attr_list.le, le);
+	if (off + sizeof(struct ATTR_LIST_ENTRY) > ni->attr_list.size) {
+		// The regular end of list
+		return NULL;
+	}
+
+	sz = le16_to_cpu(le->size);
+
+	/* Check le for errors */
+	if (sz < sizeof(struct ATTR_LIST_ENTRY) ||
+	    off + sz > ni->attr_list.size ||
+	    sz < le->name_off + le->name_len * sizeof(short)) {
+		return NULL;
+	}
+
+	return le;
+}
+
+/*
+ * al_find_le
+ *
+ * finds the first le in the list which matches type, name and vcn
+ * Returns NULL if not found
+ */
+struct ATTR_LIST_ENTRY *al_find_le(struct ntfs_inode *ni,
+				   struct ATTR_LIST_ENTRY *le,
+				   const struct ATTRIB *attr)
+{
+	CLST svcn = attr_svcn(attr);
+
+	return al_find_ex(ni, le, attr->type, attr_name(attr), attr->name_len,
+			  &svcn);
+}
+
+/*
+ * al_find_ex
+ *
+ * finds the first le in the list which matches type, name and vcn
+ * Returns NULL if not found
+ */
+struct ATTR_LIST_ENTRY *al_find_ex(struct ntfs_inode *ni,
+				   struct ATTR_LIST_ENTRY *le,
+				   enum ATTR_TYPE type, const __le16 *name,
+				   u8 name_len, const CLST *vcn)
+{
+	struct ATTR_LIST_ENTRY *ret = NULL;
+	u32 type_in = le32_to_cpu(type);
+
+	while ((le = al_enumerate(ni, le))) {
+		u64 le_vcn;
+		int diff = le32_to_cpu(le->type) - type_in;
+
+		/* List entries are sorted by type, name and vcn */
+		if (diff < 0)
+			continue;
+
+		if (diff > 0)
+			return ret;
+
+		if (le->name_len != name_len)
+			continue;
+
+		le_vcn = le64_to_cpu(le->vcn);
+		if (!le_vcn) {
+			/*
+			 * compare entry names only for entry with vcn == 0
+			 */
+			diff = ntfs_cmp_names(le_name(le), name_len, name,
+					      name_len, ni->mi.sbi->upcase,
+					      true);
+			if (diff < 0)
+				continue;
+
+			if (diff > 0)
+				return ret;
+		}
+
+		if (!vcn)
+			return le;
+
+		if (*vcn == le_vcn)
+			return le;
+
+		if (*vcn < le_vcn)
+			return ret;
+
+		ret = le;
+	}
+
+	return ret;
+}
+
+/*
+ * al_find_le_to_insert
+ *
+ * finds the first list entry which matches type, name and vcn
+ */
+static struct ATTR_LIST_ENTRY *al_find_le_to_insert(struct ntfs_inode *ni,
+						    enum ATTR_TYPE type,
+						    const __le16 *name,
+						    u8 name_len, CLST vcn)
+{
+	struct ATTR_LIST_ENTRY *le = NULL, *prev;
+	u32 type_in = le32_to_cpu(type);
+
+	/* List entries are sorted by type, name, vcn */
+	while ((le = al_enumerate(ni, prev = le))) {
+		int diff = le32_to_cpu(le->type) - type_in;
+
+		if (diff < 0)
+			continue;
+
+		if (diff > 0)
+			return le;
+
+		if (!le->vcn) {
+			/*
+			 * compare entry names only for entry with vcn == 0
+			 */
+			diff = ntfs_cmp_names(le_name(le), le->name_len, name,
+					      name_len, ni->mi.sbi->upcase,
+					      true);
+			if (diff < 0)
+				continue;
+
+			if (diff > 0)
+				return le;
+		}
+
+		if (le64_to_cpu(le->vcn) >= vcn)
+			return le;
+	}
+
+	return prev ? Add2Ptr(prev, le16_to_cpu(prev->size)) : ni->attr_list.le;
+}
+
+/*
+ * al_add_le
+ *
+ * adds an "attribute list entry" to the list.
+ */
+int al_add_le(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name,
+	      u8 name_len, CLST svcn, __le16 id, const struct MFT_REF *ref,
+	      struct ATTR_LIST_ENTRY **new_le)
+{
+	int err;
+	struct ATTRIB *attr;
+	struct ATTR_LIST_ENTRY *le;
+	size_t off;
+	u16 sz;
+	size_t asize, new_asize;
+	u64 new_size;
+	typeof(ni->attr_list) *al = &ni->attr_list;
+
+	/*
+	 * Compute the size of the new le and the new length of the
+	 * list with al le added.
+	 */
+	sz = le_size(name_len);
+	new_size = al->size + sz;
+	asize = al_aligned(al->size);
+	new_asize = al_aligned(new_size);
+
+	/* Scan forward to the point at which the new le should be inserted. */
+	le = al_find_le_to_insert(ni, type, name, name_len, svcn);
+	off = PtrOffset(al->le, le);
+
+	if (new_size > asize) {
+		void *ptr = ntfs_malloc(new_asize);
+
+		if (!ptr)
+			return -ENOMEM;
+
+		memcpy(ptr, al->le, off);
+		memcpy(Add2Ptr(ptr, off + sz), le, al->size - off);
+		le = Add2Ptr(ptr, off);
+		ntfs_free(al->le);
+		al->le = ptr;
+	} else {
+		memmove(Add2Ptr(le, sz), le, al->size - off);
+	}
+
+	al->size = new_size;
+
+	le->type = type;
+	le->size = cpu_to_le16(sz);
+	le->name_len = name_len;
+	le->name_off = offsetof(struct ATTR_LIST_ENTRY, name);
+	le->vcn = cpu_to_le64(svcn);
+	le->ref = *ref;
+	le->id = id;
+	memcpy(le->name, name, sizeof(short) * name_len);
+
+	al->dirty = true;
+
+	err = attr_set_size(ni, ATTR_LIST, NULL, 0, &al->run, new_size,
+			    &new_size, true, &attr);
+	if (err)
+		return err;
+
+	if (attr && attr->non_res) {
+		err = ntfs_sb_write_run(ni->mi.sbi, &al->run, 0, al->le,
+					al->size);
+		if (err)
+			return err;
+	}
+
+	al->dirty = false;
+	*new_le = le;
+
+	return 0;
+}
+
+/*
+ * al_remove_le
+ *
+ * removes 'le' from attribute list
+ */
+bool al_remove_le(struct ntfs_inode *ni, struct ATTR_LIST_ENTRY *le)
+{
+	u16 size;
+	size_t off;
+	typeof(ni->attr_list) *al = &ni->attr_list;
+
+	if (!al_is_valid_le(ni, le))
+		return false;
+
+	/* Save on stack the size of le */
+	size = le16_to_cpu(le->size);
+	off = PtrOffset(al->le, le);
+
+	memmove(le, Add2Ptr(le, size), al->size - (off + size));
+
+	al->size -= size;
+	al->dirty = true;
+
+	return true;
+}
+
+/*
+ * al_delete_le
+ *
+ * deletes from the list the first le which matches its parameters.
+ */
+bool al_delete_le(struct ntfs_inode *ni, enum ATTR_TYPE type, CLST vcn,
+		  const __le16 *name, size_t name_len,
+		  const struct MFT_REF *ref)
+{
+	u16 size;
+	struct ATTR_LIST_ENTRY *le;
+	size_t off;
+	typeof(ni->attr_list) *al = &ni->attr_list;
+
+	/* Scan forward to the first le that matches the input */
+	le = al_find_ex(ni, NULL, type, name, name_len, &vcn);
+	if (!le)
+		return false;
+
+	off = PtrOffset(al->le, le);
+
+next:
+	if (off >= al->size)
+		return false;
+	if (le->type != type)
+		return false;
+	if (le->name_len != name_len)
+		return false;
+	if (name_len && ntfs_cmp_names(le_name(le), name_len, name, name_len,
+				       ni->mi.sbi->upcase, true))
+		return false;
+	if (le64_to_cpu(le->vcn) != vcn)
+		return false;
+
+	/*
+	 * The caller specified a segment reference, so we have to
+	 * scan through the matching entries until we find that segment
+	 * reference or we run of matching entries.
+	 */
+	if (ref && memcmp(ref, &le->ref, sizeof(*ref))) {
+		off += le16_to_cpu(le->size);
+		le = Add2Ptr(al->le, off);
+		goto next;
+	}
+
+	/* Save on stack the size of le */
+	size = le16_to_cpu(le->size);
+	/* Delete the le. */
+	memmove(le, Add2Ptr(le, size), al->size - (off + size));
+
+	al->size -= size;
+	al->dirty = true;
+
+	return true;
+}
+
+/*
+ * al_update
+ */
+int al_update(struct ntfs_inode *ni)
+{
+	int err;
+	struct ATTRIB *attr;
+	typeof(ni->attr_list) *al = &ni->attr_list;
+
+	if (!al->dirty || !al->size)
+		return 0;
+
+	/*
+	 * attribute list increased on demand in al_add_le
+	 * attribute list decreased here
+	 */
+	err = attr_set_size(ni, ATTR_LIST, NULL, 0, &al->run, al->size, NULL,
+			    false, &attr);
+	if (err)
+		goto out;
+
+	if (!attr->non_res) {
+		memcpy(resident_data(attr), al->le, al->size);
+	} else {
+		err = ntfs_sb_write_run(ni->mi.sbi, &al->run, 0, al->le,
+					al->size);
+		if (err)
+			goto out;
+
+		attr->nres.valid_size = attr->nres.data_size;
+	}
+
+	ni->mi.dirty = true;
+	al->dirty = false;
+
+out:
+	return err;
+}
diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
new file mode 100644
index 000000000000..cc69295eb9cf
--- /dev/null
+++ b/fs/ntfs3/xattr.c
@@ -0,0 +1,1072 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+#include <linux/posix_acl.h>
+#include <linux/posix_acl_xattr.h>
+#include <linux/xattr.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+// clang-format off
+#define SYSTEM_DOS_ATTRIB    "system.dos_attrib"
+#define SYSTEM_NTFS_ATTRIB   "system.ntfs_attrib"
+#define SYSTEM_NTFS_SECURITY "system.ntfs_security"
+#define USER_DOSATTRIB       "user.DOSATTRIB"
+// clang-format on
+
+static inline size_t unpacked_ea_size(const struct EA_FULL *ea)
+{
+	return !ea->size ? DwordAlign(offsetof(struct EA_FULL, name) + 1 +
+				      ea->name_len + le16_to_cpu(ea->elength)) :
+			   le32_to_cpu(ea->size);
+}
+
+static inline size_t packed_ea_size(const struct EA_FULL *ea)
+{
+	return offsetof(struct EA_FULL, name) + 1 -
+	       offsetof(struct EA_FULL, flags) + ea->name_len +
+	       le16_to_cpu(ea->elength);
+}
+
+/*
+ * find_ea
+ *
+ * assume there is at least one xattr in the list
+ */
+static inline bool find_ea(const struct EA_FULL *ea_all, u32 bytes,
+			   const char *name, u8 name_len, u32 *off)
+{
+	*off = 0;
+
+	if (!ea_all || !bytes)
+		return false;
+
+	for (;;) {
+		const struct EA_FULL *ea = Add2Ptr(ea_all, *off);
+		u32 next_off = *off + unpacked_ea_size(ea);
+
+		if (next_off > bytes)
+			return false;
+
+		if (ea->name_len == name_len &&
+		    !memcmp(ea->name, name, name_len))
+			return true;
+
+		*off = next_off;
+		if (next_off >= bytes)
+			return false;
+	}
+}
+
+/*
+ * ntfs_read_ea
+ *
+ * reads all extended attributes
+ * ea - new allocated memory
+ * info - pointer into resident data
+ */
+static int ntfs_read_ea(struct ntfs_inode *ni, struct EA_FULL **ea,
+			size_t add_bytes, const struct EA_INFO **info)
+{
+	int err;
+	struct ATTR_LIST_ENTRY *le = NULL;
+	struct ATTRIB *attr_info, *attr_ea;
+	void *ea_p;
+	u32 size;
+
+	static_assert(le32_to_cpu(ATTR_EA_INFO) < le32_to_cpu(ATTR_EA));
+
+	*ea = NULL;
+	*info = NULL;
+
+	attr_info =
+		ni_find_attr(ni, NULL, &le, ATTR_EA_INFO, NULL, 0, NULL, NULL);
+	attr_ea =
+		ni_find_attr(ni, attr_info, &le, ATTR_EA, NULL, 0, NULL, NULL);
+
+	if (!attr_ea || !attr_info)
+		return 0;
+
+	*info = resident_data_ex(attr_info, sizeof(struct EA_INFO));
+	if (!*info)
+		return -EINVAL;
+
+	/* Check Ea limit */
+	size = le32_to_cpu((*info)->size);
+	if (size > MAX_EA_DATA_SIZE || size + add_bytes > MAX_EA_DATA_SIZE)
+		return -EINVAL;
+
+	/* Allocate memory for packed Ea */
+	ea_p = ntfs_malloc(size + add_bytes);
+	if (!ea_p)
+		return -ENOMEM;
+
+	if (attr_ea->non_res) {
+		struct runs_tree run;
+
+		run_init(&run);
+
+		err = attr_load_runs(attr_ea, ni, &run, NULL);
+		if (!err)
+			err = ntfs_read_run_nb(ni->mi.sbi, &run, 0, ea_p, size,
+					       NULL);
+		run_close(&run);
+
+		if (err)
+			goto out;
+	} else {
+		void *p = resident_data_ex(attr_ea, size);
+
+		if (!p) {
+			err = -EINVAL;
+			goto out;
+		}
+		memcpy(ea_p, p, size);
+	}
+
+	memset(Add2Ptr(ea_p, size), 0, add_bytes);
+	*ea = ea_p;
+	return 0;
+
+out:
+	ntfs_free(ea_p);
+	*ea = NULL;
+	return err;
+}
+
+/*
+ * ntfs_listxattr_hlp
+ *
+ * copy a list of xattrs names into the buffer
+ * provided, or compute the buffer size required
+ */
+static int ntfs_listxattr_hlp(struct ntfs_inode *ni, char *buffer,
+			      size_t bytes_per_buffer, size_t *bytes)
+{
+	const struct EA_INFO *info;
+	struct EA_FULL *ea_all = NULL;
+	const struct EA_FULL *ea;
+	u32 off, size;
+	int err;
+
+	*bytes = 0;
+
+	err = ntfs_read_ea(ni, &ea_all, 0, &info);
+	if (err)
+		return err;
+
+	if (!info || !ea_all)
+		return 0;
+
+	size = le32_to_cpu(info->size);
+
+	/* Enumerate all xattrs */
+	for (off = 0; off < size; off += unpacked_ea_size(ea)) {
+		ea = Add2Ptr(ea_all, off);
+
+		if (buffer) {
+			if (*bytes + ea->name_len + 1 > bytes_per_buffer) {
+				err = -ERANGE;
+				goto out;
+			}
+
+			memcpy(buffer + *bytes, ea->name, ea->name_len);
+			buffer[*bytes + ea->name_len] = 0;
+		}
+
+		*bytes += ea->name_len + 1;
+	}
+
+out:
+	ntfs_free(ea_all);
+	return err;
+}
+
+/*
+ * ntfs_get_ea
+ *
+ * reads xattr
+ */
+static int ntfs_get_ea(struct ntfs_inode *ni, const char *name, size_t name_len,
+		       void *buffer, size_t bytes_per_buffer, u32 *len)
+{
+	const struct EA_INFO *info;
+	struct EA_FULL *ea_all = NULL;
+	const struct EA_FULL *ea;
+	u32 off;
+	int err;
+
+	*len = 0;
+
+	if (name_len > 255) {
+		err = -ENAMETOOLONG;
+		goto out;
+	}
+
+	err = ntfs_read_ea(ni, &ea_all, 0, &info);
+	if (err)
+		goto out;
+
+	if (!info)
+		goto out;
+
+	/* Enumerate all xattrs */
+	if (!find_ea(ea_all, le32_to_cpu(info->size), name, name_len, &off)) {
+		err = -ENODATA;
+		goto out;
+	}
+	ea = Add2Ptr(ea_all, off);
+
+	*len = le16_to_cpu(ea->elength);
+	if (!buffer) {
+		err = 0;
+		goto out;
+	}
+
+	if (*len > bytes_per_buffer) {
+		err = -ERANGE;
+		goto out;
+	}
+	memcpy(buffer, ea->name + ea->name_len + 1, *len);
+	err = 0;
+
+out:
+	ntfs_free(ea_all);
+
+	return err;
+}
+
+static noinline int ntfs_getxattr_hlp(struct inode *inode, const char *name,
+				      void *value, size_t size,
+				      size_t *required)
+{
+	struct ntfs_inode *ni = ntfs_i(inode);
+	int err;
+	u32 len;
+
+	if (!(ni->ni_flags & NI_FLAG_EA))
+		return -ENODATA;
+
+	if (!required)
+		ni_lock(ni);
+
+	err = ntfs_get_ea(ni, name, strlen(name), value, size, &len);
+	if (!err)
+		err = len;
+	else if (-ERANGE == err && required)
+		*required = len;
+
+	if (!required)
+		ni_unlock(ni);
+
+	return err;
+}
+
+static noinline int ntfs_set_ea(struct inode *inode, const char *name,
+				const void *value, size_t val_size, int flags,
+				int locked)
+{
+	struct ntfs_inode *ni = ntfs_i(inode);
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	int err;
+	struct EA_INFO ea_info;
+	const struct EA_INFO *info;
+	struct EA_FULL *new_ea;
+	struct EA_FULL *ea_all = NULL;
+	size_t name_len, add;
+	u32 off, size;
+	__le16 size_pack;
+	struct ATTRIB *attr;
+	struct ATTR_LIST_ENTRY *le;
+	struct mft_inode *mi;
+	struct runs_tree ea_run;
+	u64 new_sz;
+	void *p;
+
+	if (!locked)
+		ni_lock(ni);
+
+	run_init(&ea_run);
+	name_len = strlen(name);
+
+	if (name_len > 255) {
+		err = -ENAMETOOLONG;
+		goto out;
+	}
+
+	add = DwordAlign(offsetof(struct EA_FULL, name) + 1 + name_len +
+			 val_size);
+
+	err = ntfs_read_ea(ni, &ea_all, add, &info);
+	if (err)
+		goto out;
+
+	if (!info) {
+		memset(&ea_info, 0, sizeof(ea_info));
+		size = 0;
+		size_pack = 0;
+	} else {
+		memcpy(&ea_info, info, sizeof(ea_info));
+		size = le32_to_cpu(ea_info.size);
+		size_pack = ea_info.size_pack;
+	}
+
+	if (info && find_ea(ea_all, size, name, name_len, &off)) {
+		struct EA_FULL *ea;
+		size_t ea_sz;
+
+		if (flags & XATTR_CREATE) {
+			err = -EEXIST;
+			goto out;
+		}
+
+		/* Remove current xattr */
+		ea = Add2Ptr(ea_all, off);
+		if (ea->flags & FILE_NEED_EA)
+			le16_add_cpu(&ea_info.count, -1);
+
+		ea_sz = unpacked_ea_size(ea);
+
+		le16_add_cpu(&ea_info.size_pack, 0 - packed_ea_size(ea));
+
+		memmove(ea, Add2Ptr(ea, ea_sz), size - off - ea_sz);
+
+		size -= ea_sz;
+		memset(Add2Ptr(ea_all, size), 0, ea_sz);
+
+		ea_info.size = cpu_to_le32(size);
+
+		if ((flags & XATTR_REPLACE) && !val_size)
+			goto update_ea;
+	} else {
+		if (flags & XATTR_REPLACE) {
+			err = -ENODATA;
+			goto out;
+		}
+
+		if (!ea_all) {
+			ea_all = ntfs_zalloc(add);
+			if (!ea_all) {
+				err = -ENOMEM;
+				goto out;
+			}
+		}
+	}
+
+	/* append new xattr */
+	new_ea = Add2Ptr(ea_all, size);
+	new_ea->size = cpu_to_le32(add);
+	new_ea->flags = 0;
+	new_ea->name_len = name_len;
+	new_ea->elength = cpu_to_le16(val_size);
+	memcpy(new_ea->name, name, name_len);
+	new_ea->name[name_len] = 0;
+	memcpy(new_ea->name + name_len + 1, value, val_size);
+
+	le16_add_cpu(&ea_info.size_pack, packed_ea_size(new_ea));
+	size += add;
+	ea_info.size = cpu_to_le32(size);
+
+update_ea:
+
+	if (!info) {
+		/* Create xattr */
+		if (!size) {
+			err = 0;
+			goto out;
+		}
+
+		err = ni_insert_resident(ni, sizeof(struct EA_INFO),
+					 ATTR_EA_INFO, NULL, 0, NULL, NULL);
+		if (err)
+			goto out;
+
+		err = ni_insert_resident(ni, 0, ATTR_EA, NULL, 0, NULL, NULL);
+		if (err)
+			goto out;
+	}
+
+	new_sz = size;
+	err = attr_set_size(ni, ATTR_EA, NULL, 0, &ea_run, new_sz, &new_sz,
+			    false, NULL);
+	if (err)
+		goto out;
+
+	le = NULL;
+	attr = ni_find_attr(ni, NULL, &le, ATTR_EA_INFO, NULL, 0, NULL, &mi);
+	if (!attr) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (!size) {
+		/* delete xattr, ATTR_EA_INFO */
+		err = ni_remove_attr_le(ni, attr, le);
+		if (err)
+			goto out;
+	} else {
+		p = resident_data_ex(attr, sizeof(struct EA_INFO));
+		if (!p) {
+			err = -EINVAL;
+			goto out;
+		}
+		memcpy(p, &ea_info, sizeof(struct EA_INFO));
+		mi->dirty = true;
+	}
+
+	le = NULL;
+	attr = ni_find_attr(ni, NULL, &le, ATTR_EA, NULL, 0, NULL, &mi);
+	if (!attr) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (!size) {
+		/* delete xattr, ATTR_EA */
+		err = ni_remove_attr_le(ni, attr, le);
+		if (err)
+			goto out;
+	} else if (attr->non_res) {
+		err = ntfs_sb_write_run(sbi, &ea_run, 0, ea_all, size);
+		if (err)
+			goto out;
+	} else {
+		p = resident_data_ex(attr, size);
+		if (!p) {
+			err = -EINVAL;
+			goto out;
+		}
+		memcpy(p, ea_all, size);
+		mi->dirty = true;
+	}
+
+	if (ea_info.size_pack != size_pack)
+		ni->ni_flags |= NI_FLAG_UPDATE_PARENT;
+	mark_inode_dirty(&ni->vfs_inode);
+
+	/* Check if we delete the last xattr */
+	if (val_size || flags != XATTR_REPLACE ||
+	    ntfs_listxattr_hlp(ni, NULL, 0, &val_size) || val_size) {
+		ni->ni_flags |= NI_FLAG_EA;
+	} else {
+		ni->ni_flags &= ~NI_FLAG_EA;
+	}
+
+out:
+	if (!locked)
+		ni_unlock(ni);
+
+	run_close(&ea_run);
+	ntfs_free(ea_all);
+
+	return err;
+}
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+static inline void ntfs_posix_acl_release(struct posix_acl *acl)
+{
+	if (acl && refcount_dec_and_test(&acl->a_refcount))
+		kfree(acl);
+}
+
+static struct posix_acl *ntfs_get_acl_ex(struct inode *inode, int type,
+					 int locked)
+{
+	struct ntfs_inode *ni = ntfs_i(inode);
+	const char *name;
+	struct posix_acl *acl;
+	size_t req;
+	int err;
+	void *buf;
+
+	/* allocate PATH_MAX bytes */
+	buf = __getname();
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	/* Possible values of 'type' was already checked above */
+	name = type == ACL_TYPE_ACCESS ? XATTR_NAME_POSIX_ACL_ACCESS :
+					 XATTR_NAME_POSIX_ACL_DEFAULT;
+
+	if (!locked)
+		ni_lock(ni);
+
+	err = ntfs_getxattr_hlp(inode, name, buf, PATH_MAX, &req);
+
+	if (!locked)
+		ni_unlock(ni);
+
+	/* Translate extended attribute to acl */
+	if (err > 0) {
+		acl = posix_acl_from_xattr(&init_user_ns, buf, err);
+		if (!IS_ERR(acl))
+			set_cached_acl(inode, type, acl);
+	} else {
+		acl = err == -ENODATA ? NULL : ERR_PTR(err);
+	}
+
+	__putname(buf);
+
+	return acl;
+}
+
+/*
+ * ntfs_get_acl
+ *
+ * inode_operations::get_acl
+ */
+struct posix_acl *ntfs_get_acl(struct inode *inode, int type)
+{
+	return ntfs_get_acl_ex(inode, type, 0);
+}
+
+static noinline int ntfs_set_acl_ex(struct inode *inode, struct posix_acl *acl,
+				    int type, int locked)
+{
+	const char *name;
+	size_t size;
+	void *value = NULL;
+	int err = 0;
+
+	if (S_ISLNK(inode->i_mode))
+		return -EOPNOTSUPP;
+
+	switch (type) {
+	case ACL_TYPE_ACCESS:
+		if (acl) {
+			umode_t mode = inode->i_mode;
+
+			err = posix_acl_equiv_mode(acl, &mode);
+			if (err < 0)
+				return err;
+
+			if (inode->i_mode != mode) {
+				inode->i_mode = mode;
+				mark_inode_dirty(inode);
+			}
+
+			if (!err) {
+				/*
+				 * acl can be exactly represented in the
+				 * traditional file mode permission bits
+				 */
+				acl = NULL;
+				goto out;
+			}
+		}
+		name = XATTR_NAME_POSIX_ACL_ACCESS;
+		break;
+
+	case ACL_TYPE_DEFAULT:
+		if (!S_ISDIR(inode->i_mode))
+			return acl ? -EACCES : 0;
+		name = XATTR_NAME_POSIX_ACL_DEFAULT;
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	if (!acl)
+		goto out;
+
+	size = posix_acl_xattr_size(acl->a_count);
+	value = ntfs_malloc(size);
+	if (!value)
+		return -ENOMEM;
+
+	err = posix_acl_to_xattr(&init_user_ns, acl, value, size);
+	if (err)
+		goto out;
+
+	err = ntfs_set_ea(inode, name, value, size, 0, locked);
+	if (err)
+		goto out;
+
+	inode->i_flags &= ~S_NOSEC;
+
+out:
+	if (!err)
+		set_cached_acl(inode, type, acl);
+
+	kfree(value);
+
+	return err;
+}
+
+/*
+ * ntfs_set_acl
+ *
+ * inode_operations::set_acl
+ */
+int ntfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
+{
+	return ntfs_set_acl_ex(inode, acl, type, 0);
+}
+
+static int ntfs_xattr_get_acl(struct inode *inode, int type, void *buffer,
+			      size_t size)
+{
+	struct super_block *sb = inode->i_sb;
+	struct posix_acl *acl;
+	int err;
+
+	if (!(sb->s_flags & SB_POSIXACL))
+		return -EOPNOTSUPP;
+
+	acl = ntfs_get_acl(inode, type);
+	if (IS_ERR(acl))
+		return PTR_ERR(acl);
+
+	if (!acl)
+		return -ENODATA;
+
+	err = posix_acl_to_xattr(&init_user_ns, acl, buffer, size);
+	ntfs_posix_acl_release(acl);
+
+	return err;
+}
+
+static int ntfs_xattr_set_acl(struct inode *inode, int type, const void *value,
+			      size_t size)
+{
+	struct super_block *sb = inode->i_sb;
+	struct posix_acl *acl;
+	int err;
+
+	if (!(sb->s_flags & SB_POSIXACL))
+		return -EOPNOTSUPP;
+
+	if (!inode_owner_or_capable(inode))
+		return -EPERM;
+
+	if (!value)
+		return 0;
+
+	acl = posix_acl_from_xattr(&init_user_ns, value, size);
+	if (IS_ERR(acl))
+		return PTR_ERR(acl);
+
+	if (acl) {
+		err = posix_acl_valid(sb->s_user_ns, acl);
+		if (err)
+			goto release_and_out;
+	}
+
+	err = ntfs_set_acl(inode, acl, type);
+
+release_and_out:
+	ntfs_posix_acl_release(acl);
+	return err;
+}
+
+/*
+ * Initialize the ACLs of a new inode. Called from ntfs_create_inode.
+ */
+int ntfs_init_acl(struct inode *inode, struct inode *dir)
+{
+	struct posix_acl *default_acl, *acl;
+	int err;
+
+	/*
+	 * TODO refactoring lock
+	 * ni_lock(dir) ... -> posix_acl_create(dir,...) -> ntfs_get_acl -> ni_lock(dir)
+	 */
+	inode->i_default_acl = NULL;
+
+	default_acl = ntfs_get_acl_ex(dir, ACL_TYPE_DEFAULT, 1);
+
+	if (!default_acl || default_acl == ERR_PTR(-EOPNOTSUPP)) {
+		inode->i_mode &= ~current_umask();
+		err = 0;
+		goto out;
+	}
+
+	if (IS_ERR(default_acl)) {
+		err = PTR_ERR(default_acl);
+		goto out;
+	}
+
+	acl = default_acl;
+	err = __posix_acl_create(&acl, GFP_NOFS, &inode->i_mode);
+	if (err < 0)
+		goto out1;
+	if (!err) {
+		posix_acl_release(acl);
+		acl = NULL;
+	}
+
+	if (!S_ISDIR(inode->i_mode)) {
+		posix_acl_release(default_acl);
+		default_acl = NULL;
+	}
+
+	if (default_acl)
+		err = ntfs_set_acl_ex(inode, default_acl, ACL_TYPE_DEFAULT, 1);
+
+	if (!acl)
+		inode->i_acl = NULL;
+	else if (!err)
+		err = ntfs_set_acl_ex(inode, acl, ACL_TYPE_ACCESS, 1);
+
+	posix_acl_release(acl);
+out1:
+	posix_acl_release(default_acl);
+
+out:
+	return err;
+}
+#endif
+
+/*
+ * ntfs_acl_chmod
+ *
+ * helper for 'ntfs3_setattr'
+ */
+int ntfs_acl_chmod(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+
+	if (!(sb->s_flags & SB_POSIXACL))
+		return 0;
+
+	if (S_ISLNK(inode->i_mode))
+		return -EOPNOTSUPP;
+
+	return posix_acl_chmod(inode, inode->i_mode);
+}
+
+/*
+ * ntfs_permission
+ *
+ * inode_operations::permission
+ */
+int ntfs_permission(struct inode *inode, int mask)
+{
+	if (ntfs_sb(inode->i_sb)->options.no_acs_rules) {
+		/* "no access rules" mode - allow all changes */
+		return 0;
+	}
+
+	return generic_permission(inode, mask);
+}
+
+/*
+ * ntfs_listxattr
+ *
+ * inode_operations::listxattr
+ */
+ssize_t ntfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
+{
+	struct inode *inode = d_inode(dentry);
+	struct ntfs_inode *ni = ntfs_i(inode);
+	ssize_t ret = -1;
+	int err;
+
+	if (!(ni->ni_flags & NI_FLAG_EA)) {
+		ret = 0;
+		goto out;
+	}
+
+	ni_lock(ni);
+
+	err = ntfs_listxattr_hlp(ni, buffer, size, (size_t *)&ret);
+
+	ni_unlock(ni);
+
+	if (err)
+		ret = err;
+out:
+
+	return ret;
+}
+
+static int ntfs_getxattr(const struct xattr_handler *handler, struct dentry *de,
+			 struct inode *inode, const char *name, void *buffer,
+			 size_t size)
+{
+	int err;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	size_t name_len = strlen(name);
+
+	/* Dispatch request */
+	if (name_len == sizeof(SYSTEM_DOS_ATTRIB) - 1 &&
+	    !memcmp(name, SYSTEM_DOS_ATTRIB, sizeof(SYSTEM_DOS_ATTRIB))) {
+		/* system.dos_attrib */
+		if (!buffer) {
+			err = sizeof(u8);
+		} else if (size < sizeof(u8)) {
+			err = -ENODATA;
+		} else {
+			err = sizeof(u8);
+			*(u8 *)buffer = le32_to_cpu(ni->std_fa);
+		}
+		goto out;
+	}
+
+	if (name_len == sizeof(SYSTEM_NTFS_ATTRIB) - 1 &&
+	    !memcmp(name, SYSTEM_NTFS_ATTRIB, sizeof(SYSTEM_NTFS_ATTRIB))) {
+		/* system.ntfs_attrib */
+		if (!buffer) {
+			err = sizeof(u32);
+		} else if (size < sizeof(u32)) {
+			err = -ENODATA;
+		} else {
+			err = sizeof(u32);
+			*(u32 *)buffer = le32_to_cpu(ni->std_fa);
+		}
+		goto out;
+	}
+
+	if (name_len == sizeof(USER_DOSATTRIB) - 1 &&
+	    !memcmp(name, USER_DOSATTRIB, sizeof(USER_DOSATTRIB))) {
+		/* user.DOSATTRIB */
+		if (!buffer) {
+			err = 5;
+		} else if (size < 5) {
+			err = -ENODATA;
+		} else {
+			err = sprintf((char *)buffer, "0x%x",
+				      le32_to_cpu(ni->std_fa) & 0xff) +
+			      1;
+		}
+		goto out;
+	}
+
+	if (name_len == sizeof(SYSTEM_NTFS_SECURITY) - 1 &&
+	    !memcmp(name, SYSTEM_NTFS_SECURITY, sizeof(SYSTEM_NTFS_SECURITY))) {
+		/* system.ntfs_security*/
+		struct SECURITY_DESCRIPTOR_RELATIVE *sd = NULL;
+		size_t sd_size = 0;
+
+		if (!is_ntfs3(ni->mi.sbi)) {
+			/* we should get nt4 security */
+			err = -EINVAL;
+			goto out;
+		} else if (le32_to_cpu(ni->std_security_id) <
+			   SECURITY_ID_FIRST) {
+			err = -ENOENT;
+			goto out;
+		}
+
+		err = ntfs_get_security_by_id(ni->mi.sbi, ni->std_security_id,
+					      &sd, &sd_size);
+		if (err)
+			goto out;
+
+		if (!is_sd_valid(sd, sd_size)) {
+			ntfs_inode_warn(
+				inode,
+				"looks like you get incorrect security descriptor id=%u",
+				ni->std_security_id);
+		}
+
+		if (!buffer) {
+			err = sd_size;
+		} else if (size < sd_size) {
+			err = -ENODATA;
+		} else {
+			err = sd_size;
+			memcpy(buffer, sd, sd_size);
+		}
+		ntfs_free(sd);
+		goto out;
+	}
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+	if ((name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1 &&
+	     !memcmp(name, XATTR_NAME_POSIX_ACL_ACCESS,
+		     sizeof(XATTR_NAME_POSIX_ACL_ACCESS))) ||
+	    (name_len == sizeof(XATTR_NAME_POSIX_ACL_DEFAULT) - 1 &&
+	     !memcmp(name, XATTR_NAME_POSIX_ACL_DEFAULT,
+		     sizeof(XATTR_NAME_POSIX_ACL_DEFAULT)))) {
+		err = ntfs_xattr_get_acl(
+			inode,
+			name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1 ?
+				ACL_TYPE_ACCESS :
+				ACL_TYPE_DEFAULT,
+			buffer, size);
+		goto out;
+	}
+#endif
+	/* deal with ntfs extended attribute */
+	err = ntfs_getxattr_hlp(inode, name, buffer, size, NULL);
+
+out:
+	return err;
+}
+
+/*
+ * ntfs_setxattr
+ *
+ * inode_operations::setxattr
+ */
+static noinline int ntfs_setxattr(const struct xattr_handler *handler,
+				  struct dentry *de, struct inode *inode,
+				  const char *name, const void *value,
+				  size_t size, int flags)
+{
+	int err = -EINVAL;
+	struct ntfs_inode *ni = ntfs_i(inode);
+	size_t name_len = strlen(name);
+	enum FILE_ATTRIBUTE new_fa;
+
+	/* Dispatch request */
+	if (name_len == sizeof(SYSTEM_DOS_ATTRIB) - 1 &&
+	    !memcmp(name, SYSTEM_DOS_ATTRIB, sizeof(SYSTEM_DOS_ATTRIB))) {
+		if (sizeof(u8) != size)
+			goto out;
+		new_fa = cpu_to_le32(*(u8 *)value);
+		goto set_new_fa;
+	}
+
+	if (name_len == sizeof(SYSTEM_NTFS_ATTRIB) - 1 &&
+	    !memcmp(name, SYSTEM_NTFS_ATTRIB, sizeof(SYSTEM_NTFS_ATTRIB))) {
+		if (size != sizeof(u32))
+			goto out;
+		new_fa = cpu_to_le32(*(u32 *)value);
+
+		if (S_ISREG(inode->i_mode)) {
+			/* Process compressed/sparsed in special way*/
+			ni_lock(ni);
+			err = ni_new_attr_flags(ni, new_fa);
+			ni_unlock(ni);
+			if (err)
+				goto out;
+		}
+		goto set_new_fa;
+	}
+
+	if (name_len == sizeof(USER_DOSATTRIB) - 1 &&
+	    !memcmp(name, USER_DOSATTRIB, sizeof(USER_DOSATTRIB))) {
+		u32 attrib;
+
+		if (size < 4 || ((char *)value)[size - 1])
+			goto out;
+
+		/*
+		 * The input value must be string in form 0x%x with last zero
+		 * This means that the 'size' must be 4, 5, ...
+		 *  E.g: 0x1 - 4 bytes, 0x20 - 5 bytes
+		 */
+		if (sscanf((char *)value, "0x%x", &attrib) != 1)
+			goto out;
+
+		new_fa = cpu_to_le32(attrib);
+set_new_fa:
+		/*
+		 * Thanks Mark Harmstone:
+		 * keep directory bit consistency
+		 */
+		if (S_ISDIR(inode->i_mode))
+			new_fa |= FILE_ATTRIBUTE_DIRECTORY;
+		else
+			new_fa &= ~FILE_ATTRIBUTE_DIRECTORY;
+
+		if (ni->std_fa != new_fa) {
+			ni->std_fa = new_fa;
+			if (new_fa & FILE_ATTRIBUTE_READONLY)
+				inode->i_mode &= ~0222;
+			else
+				inode->i_mode |= 0222;
+			/* std attribute always in primary record */
+			ni->mi.dirty = true;
+			mark_inode_dirty(inode);
+		}
+		err = 0;
+
+		goto out;
+	}
+
+	if (name_len == sizeof(SYSTEM_NTFS_SECURITY) - 1 &&
+	    !memcmp(name, SYSTEM_NTFS_SECURITY, sizeof(SYSTEM_NTFS_SECURITY))) {
+		/* system.ntfs_security*/
+		__le32 security_id;
+		bool inserted;
+		struct ATTR_STD_INFO5 *std;
+
+		if (!is_ntfs3(ni->mi.sbi)) {
+			/*
+			 * we should replace ATTR_SECURE
+			 * Skip this way cause it is nt4 feature
+			 */
+			err = -EINVAL;
+			goto out;
+		}
+
+		if (!is_sd_valid(value, size)) {
+			err = -EINVAL;
+			ntfs_inode_warn(
+				inode,
+				"you try to set invalid security descriptor");
+			goto out;
+		}
+
+		err = ntfs_insert_security(ni->mi.sbi, value, size,
+					   &security_id, &inserted);
+		if (err)
+			goto out;
+
+		ni_lock(ni);
+		std = ni_std5(ni);
+		if (!std) {
+			err = -EINVAL;
+		} else if (std->security_id != security_id) {
+			std->security_id = ni->std_security_id = security_id;
+			/* std attribute always in primary record */
+			ni->mi.dirty = true;
+			mark_inode_dirty(&ni->vfs_inode);
+		}
+		ni_unlock(ni);
+		goto out;
+	}
+
+#ifdef CONFIG_NTFS3_FS_POSIX_ACL
+	if ((name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1 &&
+	     !memcmp(name, XATTR_NAME_POSIX_ACL_ACCESS,
+		     sizeof(XATTR_NAME_POSIX_ACL_ACCESS))) ||
+	    (name_len == sizeof(XATTR_NAME_POSIX_ACL_DEFAULT) - 1 &&
+	     !memcmp(name, XATTR_NAME_POSIX_ACL_DEFAULT,
+		     sizeof(XATTR_NAME_POSIX_ACL_DEFAULT)))) {
+		err = ntfs_xattr_set_acl(
+			inode,
+			name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1 ?
+				ACL_TYPE_ACCESS :
+				ACL_TYPE_DEFAULT,
+			value, size);
+		goto out;
+	}
+#endif
+	/* deal with ntfs extended attribute */
+	err = ntfs_set_ea(inode, name, value, size, flags, 0);
+
+out:
+	return err;
+}
+
+static bool ntfs_xattr_user_list(struct dentry *dentry)
+{
+	return 1;
+}
+
+static const struct xattr_handler ntfs_xattr_handler = {
+	.prefix = "",
+	.get = ntfs_getxattr,
+	.set = ntfs_setxattr,
+	.list = ntfs_xattr_user_list,
+};
+
+const struct xattr_handler *ntfs_xattr_handlers[] = {
+	&ntfs_xattr_handler,
+	NULL,
+};

From patchwork Thu Jan 28 09:04:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053213
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-23.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,
	URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 6EFE3C433E0
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:44 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 111DA64DDF
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:44 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S232149AbhA1JIS (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:08:18 -0500
Received: from relayfre-01.paragon-software.com ([176.12.100.13]:47882 "EHLO
        relayfre-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231838AbhA1JGR (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:06:17 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relayfre-01.paragon-software.com (Postfix) with ESMTPS id
 5F00D1F86;
        Thu, 28 Jan 2021 12:05:04 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824704;
        bh=5MrskzEWYdIQwUYeFlQxfNZPXnrbm3pUehgrJmesp60=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=AGRMQqHucj3mCSB+HqLRrXGT36SNErR2OjzN3UvOh2Dyg9LPoqIhFUUldg9oC/6SU
         ZV+5B3qe05uPZpCFuU1EZMRs9TnIDAEEMDySMwOm5SZ608THSm6ljkTMYgnN70KDi7
         quKZsWMbTteMC/ZSIFZuiG/e/xbzk3NJseQ/8DsA=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:03 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 06/10] fs/ntfs3: Add compression
Date: Thu, 28 Jan 2021 12:04:51 +0300
Message-ID: 
 <20210128090455.3576502-7-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This patch adds different types of NTFS-applicable compressions:
- lznt
- lzx
- xpress
Latter two (lzx, xpress) implement Windows Compact OS feature and
were taken from ntfs-3g system comression plugin authored by Eric Biggers
(https://github.com/ebiggers/ntfs-3g-system-compression)
which were ported to ntfs3 and adapted to Linux Kernel environment.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 fs/ntfs3/lib/decompress_common.c | 332 +++++++++++++++
 fs/ntfs3/lib/decompress_common.h | 352 ++++++++++++++++
 fs/ntfs3/lib/lib.h               |  26 ++
 fs/ntfs3/lib/lzx_decompress.c    | 683 +++++++++++++++++++++++++++++++
 fs/ntfs3/lib/xpress_decompress.c | 155 +++++++
 fs/ntfs3/lznt.c                  | 452 ++++++++++++++++++++
 6 files changed, 2000 insertions(+)
 create mode 100644 fs/ntfs3/lib/decompress_common.c
 create mode 100644 fs/ntfs3/lib/decompress_common.h
 create mode 100644 fs/ntfs3/lib/lib.h
 create mode 100644 fs/ntfs3/lib/lzx_decompress.c
 create mode 100644 fs/ntfs3/lib/xpress_decompress.c
 create mode 100644 fs/ntfs3/lznt.c

diff --git a/fs/ntfs3/lib/decompress_common.c b/fs/ntfs3/lib/decompress_common.c
new file mode 100644
index 000000000000..83c9e93aea77
--- /dev/null
+++ b/fs/ntfs3/lib/decompress_common.c
@@ -0,0 +1,332 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * decompress_common.c - Code shared by the XPRESS and LZX decompressors
+ *
+ * Copyright (C) 2015 Eric Biggers
+ *
+ * This program is free software: you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License as published by the Free Software
+ * Foundation, either version 2 of the License, or (at your option) any later
+ * version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "decompress_common.h"
+
+/*
+ * make_huffman_decode_table() -
+ *
+ * Build a decoding table for a canonical prefix code, or "Huffman code".
+ *
+ * This is an internal function, not part of the library API!
+ *
+ * This takes as input the length of the codeword for each symbol in the
+ * alphabet and produces as output a table that can be used for fast
+ * decoding of prefix-encoded symbols using read_huffsym().
+ *
+ * Strictly speaking, a canonical prefix code might not be a Huffman
+ * code.  But this algorithm will work either way; and in fact, since
+ * Huffman codes are defined in terms of symbol frequencies, there is no
+ * way for the decompressor to know whether the code is a true Huffman
+ * code or not until all symbols have been decoded.
+ *
+ * Because the prefix code is assumed to be "canonical", it can be
+ * reconstructed directly from the codeword lengths.  A prefix code is
+ * canonical if and only if a longer codeword never lexicographically
+ * precedes a shorter codeword, and the lexicographic ordering of
+ * codewords of the same length is the same as the lexicographic ordering
+ * of the corresponding symbols.  Consequently, we can sort the symbols
+ * primarily by codeword length and secondarily by symbol value, then
+ * reconstruct the prefix code by generating codewords lexicographically
+ * in that order.
+ *
+ * This function does not, however, generate the prefix code explicitly.
+ * Instead, it directly builds a table for decoding symbols using the
+ * code.  The basic idea is this: given the next 'max_codeword_len' bits
+ * in the input, we can look up the decoded symbol by indexing a table
+ * containing 2**max_codeword_len entries.  A codeword with length
+ * 'max_codeword_len' will have exactly one entry in this table, whereas
+ * a codeword shorter than 'max_codeword_len' will have multiple entries
+ * in this table.  Precisely, a codeword of length n will be represented
+ * by 2**(max_codeword_len - n) entries in this table.  The 0-based index
+ * of each such entry will contain the corresponding codeword as a prefix
+ * when zero-padded on the left to 'max_codeword_len' binary digits.
+ *
+ * That's the basic idea, but we implement two optimizations regarding
+ * the format of the decode table itself:
+ *
+ * - For many compression formats, the maximum codeword length is too
+ *   long for it to be efficient to build the full decoding table
+ *   whenever a new prefix code is used.  Instead, we can build the table
+ *   using only 2**table_bits entries, where 'table_bits' is some number
+ *   less than or equal to 'max_codeword_len'.  Then, only codewords of
+ *   length 'table_bits' and shorter can be directly looked up.  For
+ *   longer codewords, the direct lookup instead produces the root of a
+ *   binary tree.  Using this tree, the decoder can do traditional
+ *   bit-by-bit decoding of the remainder of the codeword.  Child nodes
+ *   are allocated in extra entries at the end of the table; leaf nodes
+ *   contain symbols.  Note that the long-codeword case is, in general,
+ *   not performance critical, since in Huffman codes the most frequently
+ *   used symbols are assigned the shortest codeword lengths.
+ *
+ * - When we decode a symbol using a direct lookup of the table, we still
+ *   need to know its length so that the bitstream can be advanced by the
+ *   appropriate number of bits.  The simple solution is to simply retain
+ *   the 'lens' array and use the decoded symbol as an index into it.
+ *   However, this requires two separate array accesses in the fast path.
+ *   The optimization is to store the length directly in the decode
+ *   table.  We use the bottom 11 bits for the symbol and the top 5 bits
+ *   for the length.  In addition, to combine this optimization with the
+ *   previous one, we introduce a special case where the top 2 bits of
+ *   the length are both set if the entry is actually the root of a
+ *   binary tree.
+ *
+ * @decode_table:
+ *	The array in which to create the decoding table.  This must have
+ *	a length of at least ((2**table_bits) + 2 * num_syms) entries.
+ *
+ * @num_syms:
+ *	The number of symbols in the alphabet; also, the length of the
+ *	'lens' array.  Must be less than or equal to 2048.
+ *
+ * @table_bits:
+ *	The order of the decode table size, as explained above.  Must be
+ *	less than or equal to 13.
+ *
+ * @lens:
+ *	An array of length @num_syms, indexable by symbol, that gives the
+ *	length of the codeword, in bits, for that symbol.  The length can
+ *	be 0, which means that the symbol does not have a codeword
+ *	assigned.
+ *
+ * @max_codeword_len:
+ *	The longest codeword length allowed in the compression format.
+ *	All entries in 'lens' must be less than or equal to this value.
+ *	This must be less than or equal to 23.
+ *
+ * @working_space
+ *	A temporary array of length '2 * (max_codeword_len + 1) +
+ *	num_syms'.
+ *
+ * Returns 0 on success, or -1 if the lengths do not form a valid prefix
+ * code.
+ */
+int make_huffman_decode_table(u16 decode_table[], const u32 num_syms,
+			      const u32 table_bits, const u8 lens[],
+			      const u32 max_codeword_len,
+			      u16 working_space[])
+{
+	const u32 table_num_entries = 1 << table_bits;
+	u16 * const len_counts = &working_space[0];
+	u16 * const offsets = &working_space[1 * (max_codeword_len + 1)];
+	u16 * const sorted_syms = &working_space[2 * (max_codeword_len + 1)];
+	int left;
+	void *decode_table_ptr;
+	u32 sym_idx;
+	u32 codeword_len;
+	u32 stores_per_loop;
+	u32 decode_table_pos;
+	u32 len;
+	u32 sym;
+
+	/* Count how many symbols have each possible codeword length.
+	 * Note that a length of 0 indicates the corresponding symbol is not
+	 * used in the code and therefore does not have a codeword.
+	 */
+	for (len = 0; len <= max_codeword_len; len++)
+		len_counts[len] = 0;
+	for (sym = 0; sym < num_syms; sym++)
+		len_counts[lens[sym]]++;
+
+	/* We can assume all lengths are <= max_codeword_len, but we
+	 * cannot assume they form a valid prefix code.  A codeword of
+	 * length n should require a proportion of the codespace equaling
+	 * (1/2)^n.  The code is valid if and only if the codespace is
+	 * exactly filled by the lengths, by this measure.
+	 */
+	left = 1;
+	for (len = 1; len <= max_codeword_len; len++) {
+		left <<= 1;
+		left -= len_counts[len];
+		if (left < 0) {
+			/* The lengths overflow the codespace; that is, the code
+			 * is over-subscribed.
+			 */
+			return -1;
+		}
+	}
+
+	if (left) {
+		/* The lengths do not fill the codespace; that is, they form an
+		 * incomplete set.
+		 */
+		if (left == (1 << max_codeword_len)) {
+			/* The code is completely empty.  This is arguably
+			 * invalid, but in fact it is valid in LZX and XPRESS,
+			 * so we must allow it.  By definition, no symbols can
+			 * be decoded with an empty code.  Consequently, we
+			 * technically don't even need to fill in the decode
+			 * table.  However, to avoid accessing uninitialized
+			 * memory if the algorithm nevertheless attempts to
+			 * decode symbols using such a code, we zero out the
+			 * decode table.
+			 */
+			memset(decode_table, 0,
+			       table_num_entries * sizeof(decode_table[0]));
+			return 0;
+		}
+		return -1;
+	}
+
+	/* Sort the symbols primarily by length and secondarily by symbol order.
+	 */
+
+	/* Initialize 'offsets' so that offsets[len] for 1 <= len <=
+	 * max_codeword_len is the number of codewords shorter than 'len' bits.
+	 */
+	offsets[1] = 0;
+	for (len = 1; len < max_codeword_len; len++)
+		offsets[len + 1] = offsets[len] + len_counts[len];
+
+	/* Use the 'offsets' array to sort the symbols.  Note that we do not
+	 * include symbols that are not used in the code.  Consequently, fewer
+	 * than 'num_syms' entries in 'sorted_syms' may be filled.
+	 */
+	for (sym = 0; sym < num_syms; sym++)
+		if (lens[sym])
+			sorted_syms[offsets[lens[sym]]++] = sym;
+
+	/* Fill entries for codewords with length <= table_bits
+	 * --- that is, those short enough for a direct mapping.
+	 *
+	 * The table will start with entries for the shortest codeword(s), which
+	 * have the most entries.  From there, the number of entries per
+	 * codeword will decrease.
+	 */
+	decode_table_ptr = decode_table;
+	sym_idx = 0;
+	codeword_len = 1;
+	stores_per_loop = (1 << (table_bits - codeword_len));
+	for (; stores_per_loop != 0; codeword_len++, stores_per_loop >>= 1) {
+		u32 end_sym_idx = sym_idx + len_counts[codeword_len];
+
+		for (; sym_idx < end_sym_idx; sym_idx++) {
+			u16 entry;
+			u16 *p;
+			u32 n;
+
+			entry = ((u32)codeword_len << 11) | sorted_syms[sym_idx];
+			p = (u16 *)decode_table_ptr;
+			n = stores_per_loop;
+
+			do {
+				*p++ = entry;
+			} while (--n);
+
+			decode_table_ptr = p;
+		}
+	}
+
+	/* If we've filled in the entire table, we are done.  Otherwise,
+	 * there are codewords longer than table_bits for which we must
+	 * generate binary trees.
+	 */
+	decode_table_pos = (u16 *)decode_table_ptr - decode_table;
+	if (decode_table_pos != table_num_entries) {
+		u32 j;
+		u32 next_free_tree_slot;
+		u32 cur_codeword;
+
+		/* First, zero out the remaining entries.  This is
+		 * necessary so that these entries appear as
+		 * "unallocated" in the next part.  Each of these entries
+		 * will eventually be filled with the representation of
+		 * the root node of a binary tree.
+		 */
+		j = decode_table_pos;
+		do {
+			decode_table[j] = 0;
+		} while (++j != table_num_entries);
+
+		/* We allocate child nodes starting at the end of the
+		 * direct lookup table.  Note that there should be
+		 * 2*num_syms extra entries for this purpose, although
+		 * fewer than this may actually be needed.
+		 */
+		next_free_tree_slot = table_num_entries;
+
+		/* Iterate through each codeword with length greater than
+		 * 'table_bits', primarily in order of codeword length
+		 * and secondarily in order of symbol.
+		 */
+		for (cur_codeword = decode_table_pos << 1;
+		     codeword_len <= max_codeword_len;
+		     codeword_len++, cur_codeword <<= 1) {
+			u32 end_sym_idx = sym_idx + len_counts[codeword_len];
+
+			for (; sym_idx < end_sym_idx; sym_idx++, cur_codeword++) {
+				/* 'sorted_sym' is the symbol represented by the
+				 * codeword.
+				 */
+				u32 sorted_sym = sorted_syms[sym_idx];
+				u32 extra_bits = codeword_len - table_bits;
+				u32 node_idx = cur_codeword >> extra_bits;
+
+				/* Go through each bit of the current codeword
+				 * beyond the prefix of length @table_bits and
+				 * walk the appropriate binary tree, allocating
+				 * any slots that have not yet been allocated.
+				 *
+				 * Note that the 'pointer' entry to the binary
+				 * tree, which is stored in the direct lookup
+				 * portion of the table, is represented
+				 * identically to other internal (non-leaf)
+				 * nodes of the binary tree; it can be thought
+				 * of as simply the root of the tree.  The
+				 * representation of these internal nodes is
+				 * simply the index of the left child combined
+				 * with the special bits 0xC000 to distingush
+				 * the entry from direct mapping and leaf node
+				 * entries.
+				 */
+				do {
+					/* At least one bit remains in the
+					 * codeword, but the current node is an
+					 * unallocated leaf.  Change it to an
+					 * internal node.
+					 */
+					if (decode_table[node_idx] == 0) {
+						decode_table[node_idx] =
+							next_free_tree_slot | 0xC000;
+						decode_table[next_free_tree_slot++] = 0;
+						decode_table[next_free_tree_slot++] = 0;
+					}
+
+					/* Go to the left child if the next bit
+					 * in the codeword is 0; otherwise go to
+					 * the right child.
+					 */
+					node_idx = decode_table[node_idx] & 0x3FFF;
+					--extra_bits;
+					node_idx += (cur_codeword >> extra_bits) & 1;
+				} while (extra_bits != 0);
+
+				/* We've traversed the tree using the entire
+				 * codeword, and we're now at the entry where
+				 * the actual symbol will be stored.  This is
+				 * distinguished from internal nodes by not
+				 * having its high two bits set.
+				 */
+				decode_table[node_idx] = sorted_sym;
+			}
+		}
+	}
+	return 0;
+}
diff --git a/fs/ntfs3/lib/decompress_common.h b/fs/ntfs3/lib/decompress_common.h
new file mode 100644
index 000000000000..66297f398403
--- /dev/null
+++ b/fs/ntfs3/lib/decompress_common.h
@@ -0,0 +1,352 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+/*
+ * decompress_common.h - Code shared by the XPRESS and LZX decompressors
+ *
+ * Copyright (C) 2015 Eric Biggers
+ *
+ * This program is free software: you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License as published by the Free Software
+ * Foundation, either version 2 of the License, or (at your option) any later
+ * version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/string.h>
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <asm/unaligned.h>
+
+
+/* "Force inline" macro (not required, but helpful for performance)  */
+#define forceinline __always_inline
+
+/* Enable whole-word match copying on selected architectures  */
+#if defined(__i386__) || defined(__x86_64__) || defined(__ARM_FEATURE_UNALIGNED)
+#  define FAST_UNALIGNED_ACCESS
+#endif
+
+/* Size of a machine word  */
+#define WORDBYTES (sizeof(size_t))
+
+static forceinline void
+copy_unaligned_word(const void *src, void *dst)
+{
+	put_unaligned(get_unaligned((const size_t *)src), (size_t *)dst);
+}
+
+
+/* Generate a "word" with platform-dependent size whose bytes all contain the
+ * value 'b'.
+ */
+static forceinline size_t repeat_byte(u8 b)
+{
+	size_t v;
+
+	v = b;
+	v |= v << 8;
+	v |= v << 16;
+	v |= v << ((WORDBYTES == 8) ? 32 : 0);
+	return v;
+}
+
+/* Structure that encapsulates a block of in-memory data being interpreted as a
+ * stream of bits, optionally with interwoven literal bytes.  Bits are assumed
+ * to be stored in little endian 16-bit coding units, with the bits ordered high
+ * to low.
+ */
+struct input_bitstream {
+
+	/* Bits that have been read from the input buffer.  The bits are
+	 * left-justified; the next bit is always bit 31.
+	 */
+	u32 bitbuf;
+
+	/* Number of bits currently held in @bitbuf.  */
+	u32 bitsleft;
+
+	/* Pointer to the next byte to be retrieved from the input buffer.  */
+	const u8 *next;
+
+	/* Pointer to just past the end of the input buffer.  */
+	const u8 *end;
+};
+
+/* Initialize a bitstream to read from the specified input buffer.  */
+static forceinline void init_input_bitstream(struct input_bitstream *is,
+					     const void *buffer, u32 size)
+{
+	is->bitbuf = 0;
+	is->bitsleft = 0;
+	is->next = buffer;
+	is->end = is->next + size;
+}
+
+/* Ensure the bit buffer variable for the bitstream contains at least @num_bits
+ * bits.  Following this, bitstream_peek_bits() and/or bitstream_remove_bits()
+ * may be called on the bitstream to peek or remove up to @num_bits bits.  Note
+ * that @num_bits must be <= 16.
+ */
+static forceinline void bitstream_ensure_bits(struct input_bitstream *is,
+					      u32 num_bits)
+{
+	if (is->bitsleft < num_bits) {
+		if (is->end - is->next >= 2) {
+			is->bitbuf |= (u32)get_unaligned_le16(is->next)
+					<< (16 - is->bitsleft);
+			is->next += 2;
+		}
+		is->bitsleft += 16;
+	}
+}
+
+/* Return the next @num_bits bits from the bitstream, without removing them.
+ * There must be at least @num_bits remaining in the buffer variable, from a
+ * previous call to bitstream_ensure_bits().
+ */
+static forceinline u32
+bitstream_peek_bits(const struct input_bitstream *is, const u32 num_bits)
+{
+	return (is->bitbuf >> 1) >> (sizeof(is->bitbuf) * 8 - num_bits - 1);
+}
+
+/* Remove @num_bits from the bitstream.  There must be at least @num_bits
+ * remaining in the buffer variable, from a previous call to
+ * bitstream_ensure_bits().
+ */
+static forceinline void
+bitstream_remove_bits(struct input_bitstream *is, u32 num_bits)
+{
+	is->bitbuf <<= num_bits;
+	is->bitsleft -= num_bits;
+}
+
+/* Remove and return @num_bits bits from the bitstream.  There must be at least
+ * @num_bits remaining in the buffer variable, from a previous call to
+ * bitstream_ensure_bits().
+ */
+static forceinline u32
+bitstream_pop_bits(struct input_bitstream *is, u32 num_bits)
+{
+	u32 bits = bitstream_peek_bits(is, num_bits);
+
+	bitstream_remove_bits(is, num_bits);
+	return bits;
+}
+
+/* Read and return the next @num_bits bits from the bitstream.  */
+static forceinline u32
+bitstream_read_bits(struct input_bitstream *is, u32 num_bits)
+{
+	bitstream_ensure_bits(is, num_bits);
+	return bitstream_pop_bits(is, num_bits);
+}
+
+/* Read and return the next literal byte embedded in the bitstream.  */
+static forceinline u8
+bitstream_read_byte(struct input_bitstream *is)
+{
+	if (unlikely(is->end == is->next))
+		return 0;
+	return *is->next++;
+}
+
+/* Read and return the next 16-bit integer embedded in the bitstream.  */
+static forceinline u16
+bitstream_read_u16(struct input_bitstream *is)
+{
+	u16 v;
+
+	if (unlikely(is->end - is->next < 2))
+		return 0;
+	v = get_unaligned_le16(is->next);
+	is->next += 2;
+	return v;
+}
+
+/* Read and return the next 32-bit integer embedded in the bitstream.  */
+static forceinline u32
+bitstream_read_u32(struct input_bitstream *is)
+{
+	u32 v;
+
+	if (unlikely(is->end - is->next < 4))
+		return 0;
+	v = get_unaligned_le32(is->next);
+	is->next += 4;
+	return v;
+}
+
+/* Read into @dst_buffer an array of literal bytes embedded in the bitstream.
+ * Return either a pointer to the byte past the last written, or NULL if the
+ * read overflows the input buffer.
+ */
+static forceinline void *bitstream_read_bytes(struct input_bitstream *is,
+					      void *dst_buffer, size_t count)
+{
+	if ((size_t)(is->end - is->next) < count)
+		return NULL;
+	memcpy(dst_buffer, is->next, count);
+	is->next += count;
+	return (u8 *)dst_buffer + count;
+}
+
+/* Align the input bitstream on a coding-unit boundary.  */
+static forceinline void bitstream_align(struct input_bitstream *is)
+{
+	is->bitsleft = 0;
+	is->bitbuf = 0;
+}
+
+extern int make_huffman_decode_table(u16 decode_table[], const u32 num_syms,
+				     const u32 num_bits, const u8 lens[],
+				     const u32 max_codeword_len,
+				     u16 working_space[]);
+
+
+/* Reads and returns the next Huffman-encoded symbol from a bitstream.  If the
+ * input data is exhausted, the Huffman symbol is decoded as if the missing bits
+ * are all zeroes.
+ */
+static forceinline u32 read_huffsym(struct input_bitstream *istream,
+					 const u16 decode_table[],
+					 u32 table_bits,
+					 u32 max_codeword_len)
+{
+	u32 entry;
+	u32 key_bits;
+
+	bitstream_ensure_bits(istream, max_codeword_len);
+
+	/* Index the decode table by the next table_bits bits of the input.  */
+	key_bits = bitstream_peek_bits(istream, table_bits);
+	entry = decode_table[key_bits];
+	if (entry < 0xC000) {
+		/* Fast case: The decode table directly provided the
+		 * symbol and codeword length.  The low 11 bits are the
+		 * symbol, and the high 5 bits are the codeword length.
+		 */
+		bitstream_remove_bits(istream, entry >> 11);
+		return entry & 0x7FF;
+	}
+	/* Slow case: The codeword for the symbol is longer than
+	 * table_bits, so the symbol does not have an entry
+	 * directly in the first (1 << table_bits) entries of the
+	 * decode table.  Traverse the appropriate binary tree
+	 * bit-by-bit to decode the symbol.
+	 */
+	bitstream_remove_bits(istream, table_bits);
+	do {
+		key_bits = (entry & 0x3FFF) + bitstream_pop_bits(istream, 1);
+	} while ((entry = decode_table[key_bits]) >= 0xC000);
+	return entry;
+}
+
+/*
+ * Copy an LZ77 match at (dst - offset) to dst.
+ *
+ * The length and offset must be already validated --- that is, (dst - offset)
+ * can't underrun the output buffer, and (dst + length) can't overrun the output
+ * buffer.  Also, the length cannot be 0.
+ *
+ * @bufend points to the byte past the end of the output buffer.  This function
+ * won't write any data beyond this position.
+ *
+ * Returns dst + length.
+ */
+static forceinline u8 *lz_copy(u8 *dst, u32 length, u32 offset, const u8 *bufend,
+			       u32 min_length)
+{
+	const u8 *src = dst - offset;
+
+	/*
+	 * Try to copy one machine word at a time.  On i386 and x86_64 this is
+	 * faster than copying one byte at a time, unless the data is
+	 * near-random and all the matches have very short lengths.  Note that
+	 * since this requires unaligned memory accesses, it won't necessarily
+	 * be faster on every architecture.
+	 *
+	 * Also note that we might copy more than the length of the match.  For
+	 * example, if a word is 8 bytes and the match is of length 5, then
+	 * we'll simply copy 8 bytes.  This is okay as long as we don't write
+	 * beyond the end of the output buffer, hence the check for (bufend -
+	 * end >= WORDBYTES - 1).
+	 */
+#ifdef FAST_UNALIGNED_ACCESS
+	u8 * const end = dst + length;
+
+	if (bufend - end >= (ptrdiff_t)(WORDBYTES - 1)) {
+
+		if (offset >= WORDBYTES) {
+			/* The source and destination words don't overlap.  */
+
+			/* To improve branch prediction, one iteration of this
+			 * loop is unrolled.  Most matches are short and will
+			 * fail the first check.  But if that check passes, then
+			 * it becomes increasing likely that the match is long
+			 * and we'll need to continue copying.
+			 */
+
+			copy_unaligned_word(src, dst);
+			src += WORDBYTES;
+			dst += WORDBYTES;
+
+			if (dst < end) {
+				do {
+					copy_unaligned_word(src, dst);
+					src += WORDBYTES;
+					dst += WORDBYTES;
+				} while (dst < end);
+			}
+			return end;
+		} else if (offset == 1) {
+
+			/* Offset 1 matches are equivalent to run-length
+			 * encoding of the previous byte.  This case is common
+			 * if the data contains many repeated bytes.
+			 */
+			size_t v = repeat_byte(*(dst - 1));
+
+			do {
+				put_unaligned(v, (size_t *)dst);
+				src += WORDBYTES;
+				dst += WORDBYTES;
+			} while (dst < end);
+			return end;
+		}
+		/*
+		 * We don't bother with special cases for other 'offset <
+		 * WORDBYTES', which are usually rarer than 'offset == 1'.  Extra
+		 * checks will just slow things down.  Actually, it's possible
+		 * to handle all the 'offset < WORDBYTES' cases using the same
+		 * code, but it still becomes more complicated doesn't seem any
+		 * faster overall; it definitely slows down the more common
+		 * 'offset == 1' case.
+		 */
+	}
+#endif /* FAST_UNALIGNED_ACCESS */
+
+	/* Fall back to a bytewise copy.  */
+
+	if (min_length >= 2) {
+		*dst++ = *src++;
+		length--;
+	}
+	if (min_length >= 3) {
+		*dst++ = *src++;
+		length--;
+	}
+	do {
+		*dst++ = *src++;
+	} while (--length);
+
+	return dst;
+}
diff --git a/fs/ntfs3/lib/lib.h b/fs/ntfs3/lib/lib.h
new file mode 100644
index 000000000000..f508fbad2e71
--- /dev/null
+++ b/fs/ntfs3/lib/lib.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Adapted for linux kernel by Alexander Mamaev:
+ * - remove implementations of get_unaligned_
+ * - assume GCC is always defined
+ * - ISO C90
+ * - linux kernel code style
+ */
+
+
+/* globals from xpress_decompress.c */
+struct xpress_decompressor *xpress_allocate_decompressor(void);
+void xpress_free_decompressor(struct xpress_decompressor *d);
+int xpress_decompress(struct xpress_decompressor *__restrict d,
+		      const void *__restrict compressed_data,
+		      size_t compressed_size,
+		      void *__restrict uncompressed_data,
+		      size_t uncompressed_size);
+
+/* globals from lzx_decompress.c */
+struct lzx_decompressor *lzx_allocate_decompressor(void);
+void lzx_free_decompressor(struct lzx_decompressor *d);
+int lzx_decompress(struct lzx_decompressor *__restrict d,
+		   const void *__restrict compressed_data,
+		   size_t compressed_size, void *__restrict uncompressed_data,
+		   size_t uncompressed_size);
diff --git a/fs/ntfs3/lib/lzx_decompress.c b/fs/ntfs3/lib/lzx_decompress.c
new file mode 100644
index 000000000000..77a381a693d1
--- /dev/null
+++ b/fs/ntfs3/lib/lzx_decompress.c
@@ -0,0 +1,683 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * lzx_decompress.c - A decompressor for the LZX compression format, which can
+ * be used in "System Compressed" files.  This is based on the code from wimlib.
+ * This code only supports a window size (dictionary size) of 32768 bytes, since
+ * this is the only size used in System Compression.
+ *
+ * Copyright (C) 2015 Eric Biggers
+ *
+ * This program is free software: you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License as published by the Free Software
+ * Foundation, either version 2 of the License, or (at your option) any later
+ * version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "decompress_common.h"
+#include "lib.h"
+
+/* Number of literal byte values  */
+#define LZX_NUM_CHARS			256
+
+/* The smallest and largest allowed match lengths  */
+#define LZX_MIN_MATCH_LEN		2
+#define LZX_MAX_MATCH_LEN		257
+
+/* Number of distinct match lengths that can be represented  */
+#define LZX_NUM_LENS			(LZX_MAX_MATCH_LEN - LZX_MIN_MATCH_LEN + 1)
+
+/* Number of match lengths for which no length symbol is required  */
+#define LZX_NUM_PRIMARY_LENS		7
+#define LZX_NUM_LEN_HEADERS		(LZX_NUM_PRIMARY_LENS + 1)
+
+/* Valid values of the 3-bit block type field  */
+#define LZX_BLOCKTYPE_VERBATIM		1
+#define LZX_BLOCKTYPE_ALIGNED		2
+#define LZX_BLOCKTYPE_UNCOMPRESSED	3
+
+/* Number of offset slots for a window size of 32768  */
+#define LZX_NUM_OFFSET_SLOTS		30
+
+/* Number of symbols in the main code for a window size of 32768  */
+#define LZX_MAINCODE_NUM_SYMBOLS	\
+	(LZX_NUM_CHARS + (LZX_NUM_OFFSET_SLOTS * LZX_NUM_LEN_HEADERS))
+
+/* Number of symbols in the length code  */
+#define LZX_LENCODE_NUM_SYMBOLS		(LZX_NUM_LENS - LZX_NUM_PRIMARY_LENS)
+
+/* Number of symbols in the precode  */
+#define LZX_PRECODE_NUM_SYMBOLS		20
+
+/* Number of bits in which each precode codeword length is represented  */
+#define LZX_PRECODE_ELEMENT_SIZE	4
+
+/* Number of low-order bits of each match offset that are entropy-encoded in
+ * aligned offset blocks
+ */
+#define LZX_NUM_ALIGNED_OFFSET_BITS	3
+
+/* Number of symbols in the aligned offset code  */
+#define LZX_ALIGNEDCODE_NUM_SYMBOLS	(1 << LZX_NUM_ALIGNED_OFFSET_BITS)
+
+/* Mask for the match offset bits that are entropy-encoded in aligned offset
+ * blocks
+ */
+#define LZX_ALIGNED_OFFSET_BITMASK	((1 << LZX_NUM_ALIGNED_OFFSET_BITS) - 1)
+
+/* Number of bits in which each aligned offset codeword length is represented  */
+#define LZX_ALIGNEDCODE_ELEMENT_SIZE	3
+
+/* Maximum lengths (in bits) of the codewords in each Huffman code  */
+#define LZX_MAX_MAIN_CODEWORD_LEN	16
+#define LZX_MAX_LEN_CODEWORD_LEN	16
+#define LZX_MAX_PRE_CODEWORD_LEN	((1 << LZX_PRECODE_ELEMENT_SIZE) - 1)
+#define LZX_MAX_ALIGNED_CODEWORD_LEN	((1 << LZX_ALIGNEDCODE_ELEMENT_SIZE) - 1)
+
+/* The default "filesize" value used in pre/post-processing.  In the LZX format
+ * used in cabinet files this value must be given to the decompressor, whereas
+ * in the LZX format used in WIM files and system-compressed files this value is
+ * fixed at 12000000.
+ */
+#define LZX_DEFAULT_FILESIZE		12000000
+
+/* Assumed block size when the encoded block size begins with a 0 bit.  */
+#define LZX_DEFAULT_BLOCK_SIZE		32768
+
+/* Number of offsets in the recent (or "repeat") offsets queue.  */
+#define LZX_NUM_RECENT_OFFSETS		3
+
+/* These values are chosen for fast decompression.  */
+#define LZX_MAINCODE_TABLEBITS		11
+#define LZX_LENCODE_TABLEBITS		10
+#define LZX_PRECODE_TABLEBITS		6
+#define LZX_ALIGNEDCODE_TABLEBITS	7
+
+#define LZX_READ_LENS_MAX_OVERRUN	50
+
+/* Mapping: offset slot => first match offset that uses that offset slot.
+ */
+static const u32 lzx_offset_slot_base[LZX_NUM_OFFSET_SLOTS + 1] = {
+	0,	1,	2,	3,	4,	/* 0  --- 4  */
+	6,	8,	12,	16,	24,	/* 5  --- 9  */
+	32,	48,	64,	96,	128,	/* 10 --- 14 */
+	192,	256,	384,	512,	768,	/* 15 --- 19 */
+	1024,	1536,	2048,	3072,	4096,   /* 20 --- 24 */
+	6144,	8192,	12288,	16384,	24576,	/* 25 --- 29 */
+	32768,					/* extra     */
+};
+
+/* Mapping: offset slot => how many extra bits must be read and added to the
+ * corresponding offset slot base to decode the match offset.
+ */
+static const u8 lzx_extra_offset_bits[LZX_NUM_OFFSET_SLOTS] = {
+	0,	0,	0,	0,	1,
+	1,	2,	2,	3,	3,
+	4,	4,	5,	5,	6,
+	6,	7,	7,	8,	8,
+	9,	9,	10,	10,	11,
+	11,	12,	12,	13,	13,
+};
+
+/* Reusable heap-allocated memory for LZX decompression  */
+struct lzx_decompressor {
+
+	/* Huffman decoding tables, and arrays that map symbols to codeword
+	 * lengths
+	 */
+
+	u16 maincode_decode_table[(1 << LZX_MAINCODE_TABLEBITS) +
+					(LZX_MAINCODE_NUM_SYMBOLS * 2)];
+	u8 maincode_lens[LZX_MAINCODE_NUM_SYMBOLS + LZX_READ_LENS_MAX_OVERRUN];
+
+
+	u16 lencode_decode_table[(1 << LZX_LENCODE_TABLEBITS) +
+					(LZX_LENCODE_NUM_SYMBOLS * 2)];
+	u8 lencode_lens[LZX_LENCODE_NUM_SYMBOLS + LZX_READ_LENS_MAX_OVERRUN];
+
+
+	u16 alignedcode_decode_table[(1 << LZX_ALIGNEDCODE_TABLEBITS) +
+					(LZX_ALIGNEDCODE_NUM_SYMBOLS * 2)];
+	u8 alignedcode_lens[LZX_ALIGNEDCODE_NUM_SYMBOLS];
+
+	u16 precode_decode_table[(1 << LZX_PRECODE_TABLEBITS) +
+				 (LZX_PRECODE_NUM_SYMBOLS * 2)];
+	u8 precode_lens[LZX_PRECODE_NUM_SYMBOLS];
+
+	/* Temporary space for make_huffman_decode_table()  */
+	u16 working_space[2 * (1 + LZX_MAX_MAIN_CODEWORD_LEN) +
+			  LZX_MAINCODE_NUM_SYMBOLS];
+};
+
+static void undo_e8_translation(void *target, s32 input_pos)
+{
+	s32 abs_offset, rel_offset;
+
+	abs_offset = get_unaligned_le32(target);
+	if (abs_offset >= 0) {
+		if (abs_offset < LZX_DEFAULT_FILESIZE) {
+			/* "good translation" */
+			rel_offset = abs_offset - input_pos;
+			put_unaligned_le32(rel_offset, target);
+		}
+	} else {
+		if (abs_offset >= -input_pos) {
+			/* "compensating translation" */
+			rel_offset = abs_offset + LZX_DEFAULT_FILESIZE;
+			put_unaligned_le32(rel_offset, target);
+		}
+	}
+}
+
+/*
+ * Undo the 'E8' preprocessing used in LZX.  Before compression, the
+ * uncompressed data was preprocessed by changing the targets of suspected x86
+ * CALL instructions from relative offsets to absolute offsets.  After
+ * match/literal decoding, the decompressor must undo the translation.
+ */
+static void lzx_postprocess(u8 *data, u32 size)
+{
+	/*
+	 * A worthwhile optimization is to push the end-of-buffer check into the
+	 * relatively rare E8 case.  This is possible if we replace the last six
+	 * bytes of data with E8 bytes; then we are guaranteed to hit an E8 byte
+	 * before reaching end-of-buffer.  In addition, this scheme guarantees
+	 * that no translation can begin following an E8 byte in the last 10
+	 * bytes because a 4-byte offset containing E8 as its high byte is a
+	 * large negative number that is not valid for translation.  That is
+	 * exactly what we need.
+	 */
+	u8 *tail;
+	u8 saved_bytes[6];
+	u8 *p;
+
+	if (size <= 10)
+		return;
+
+	tail = &data[size - 6];
+	memcpy(saved_bytes, tail, 6);
+	memset(tail, 0xE8, 6);
+	p = data;
+	for (;;) {
+		while (*p != 0xE8)
+			p++;
+		if (p >= tail)
+			break;
+		undo_e8_translation(p + 1, p - data);
+		p += 5;
+	}
+	memcpy(tail, saved_bytes, 6);
+}
+
+/* Read a Huffman-encoded symbol using the precode.  */
+static forceinline u32 read_presym(const struct lzx_decompressor *d,
+					struct input_bitstream *is)
+{
+	return read_huffsym(is, d->precode_decode_table,
+			    LZX_PRECODE_TABLEBITS, LZX_MAX_PRE_CODEWORD_LEN);
+}
+
+/* Read a Huffman-encoded symbol using the main code.  */
+static forceinline u32 read_mainsym(const struct lzx_decompressor *d,
+					 struct input_bitstream *is)
+{
+	return read_huffsym(is, d->maincode_decode_table,
+			    LZX_MAINCODE_TABLEBITS, LZX_MAX_MAIN_CODEWORD_LEN);
+}
+
+/* Read a Huffman-encoded symbol using the length code.  */
+static forceinline u32 read_lensym(const struct lzx_decompressor *d,
+					struct input_bitstream *is)
+{
+	return read_huffsym(is, d->lencode_decode_table,
+			    LZX_LENCODE_TABLEBITS, LZX_MAX_LEN_CODEWORD_LEN);
+}
+
+/* Read a Huffman-encoded symbol using the aligned offset code.  */
+static forceinline u32 read_alignedsym(const struct lzx_decompressor *d,
+					    struct input_bitstream *is)
+{
+	return read_huffsym(is, d->alignedcode_decode_table,
+			    LZX_ALIGNEDCODE_TABLEBITS,
+			    LZX_MAX_ALIGNED_CODEWORD_LEN);
+}
+
+/*
+ * Read the precode from the compressed input bitstream, then use it to decode
+ * @num_lens codeword length values.
+ *
+ * @is:		The input bitstream.
+ *
+ * @lens:	An array that contains the length values from the previous time
+ *		the codeword lengths for this Huffman code were read, or all 0's
+ *		if this is the first time.  This array must have at least
+ *		(@num_lens + LZX_READ_LENS_MAX_OVERRUN) entries.
+ *
+ * @num_lens:	Number of length values to decode.
+ *
+ * Returns 0 on success, or -1 if the data was invalid.
+ */
+static int lzx_read_codeword_lens(struct lzx_decompressor *d,
+				  struct input_bitstream *is,
+				  u8 *lens, u32 num_lens)
+{
+	u8 *len_ptr = lens;
+	u8 *lens_end = lens + num_lens;
+	int i;
+
+	/* Read the lengths of the precode codewords.  These are given
+	 * explicitly.
+	 */
+	for (i = 0; i < LZX_PRECODE_NUM_SYMBOLS; i++) {
+		d->precode_lens[i] =
+			bitstream_read_bits(is, LZX_PRECODE_ELEMENT_SIZE);
+	}
+
+	/* Make the decoding table for the precode.  */
+	if (make_huffman_decode_table(d->precode_decode_table,
+				      LZX_PRECODE_NUM_SYMBOLS,
+				      LZX_PRECODE_TABLEBITS,
+				      d->precode_lens,
+				      LZX_MAX_PRE_CODEWORD_LEN,
+				      d->working_space))
+		return -1;
+
+	/* Decode the codeword lengths.  */
+	do {
+		u32 presym;
+		u8 len;
+
+		/* Read the next precode symbol.  */
+		presym = read_presym(d, is);
+		if (presym < 17) {
+			/* Difference from old length  */
+			len = *len_ptr - presym;
+			if ((s8)len < 0)
+				len += 17;
+			*len_ptr++ = len;
+		} else {
+			/* Special RLE values  */
+
+			u32 run_len;
+
+			if (presym == 17) {
+				/* Run of 0's  */
+				run_len = 4 + bitstream_read_bits(is, 4);
+				len = 0;
+			} else if (presym == 18) {
+				/* Longer run of 0's  */
+				run_len = 20 + bitstream_read_bits(is, 5);
+				len = 0;
+			} else {
+				/* Run of identical lengths  */
+				run_len = 4 + bitstream_read_bits(is, 1);
+				presym = read_presym(d, is);
+				if (presym > 17)
+					return -1;
+				len = *len_ptr - presym;
+				if ((s8)len < 0)
+					len += 17;
+			}
+
+			do {
+				*len_ptr++ = len;
+			} while (--run_len);
+			/* Worst case overrun is when presym == 18,
+			 * run_len == 20 + 31, and only 1 length was remaining.
+			 * So LZX_READ_LENS_MAX_OVERRUN == 50.
+			 *
+			 * Overrun while reading the first half of maincode_lens
+			 * can corrupt the previous values in the second half.
+			 * This doesn't really matter because the resulting
+			 * lengths will still be in range, and data that
+			 * generates overruns is invalid anyway.
+			 */
+		}
+	} while (len_ptr < lens_end);
+
+	return 0;
+}
+
+/*
+ * Read the header of an LZX block and save the block type and (uncompressed)
+ * size in *block_type_ret and *block_size_ret, respectively.
+ *
+ * If the block is compressed, also update the Huffman decode @tables with the
+ * new Huffman codes.  If the block is uncompressed, also update the match
+ * offset @queue with the new match offsets.
+ *
+ * Return 0 on success, or -1 if the data was invalid.
+ */
+static int lzx_read_block_header(struct lzx_decompressor *d,
+				 struct input_bitstream *is,
+				 int *block_type_ret,
+				 u32 *block_size_ret,
+				 u32 recent_offsets[])
+{
+	int block_type;
+	u32 block_size;
+	int i;
+
+	bitstream_ensure_bits(is, 4);
+
+	/* The first three bits tell us what kind of block it is, and should be
+	 * one of the LZX_BLOCKTYPE_* values.
+	 */
+	block_type = bitstream_pop_bits(is, 3);
+
+	/* Read the block size.  */
+	if (bitstream_pop_bits(is, 1)) {
+		block_size = LZX_DEFAULT_BLOCK_SIZE;
+	} else {
+		block_size = 0;
+		block_size |= bitstream_read_bits(is, 8);
+		block_size <<= 8;
+		block_size |= bitstream_read_bits(is, 8);
+	}
+
+	switch (block_type) {
+
+	case LZX_BLOCKTYPE_ALIGNED:
+
+		/* Read the aligned offset code and prepare its decode table.
+		 */
+
+		for (i = 0; i < LZX_ALIGNEDCODE_NUM_SYMBOLS; i++) {
+			d->alignedcode_lens[i] =
+				bitstream_read_bits(is,
+						    LZX_ALIGNEDCODE_ELEMENT_SIZE);
+		}
+
+		if (make_huffman_decode_table(d->alignedcode_decode_table,
+					      LZX_ALIGNEDCODE_NUM_SYMBOLS,
+					      LZX_ALIGNEDCODE_TABLEBITS,
+					      d->alignedcode_lens,
+					      LZX_MAX_ALIGNED_CODEWORD_LEN,
+					      d->working_space))
+			return -1;
+
+		/* Fall though, since the rest of the header for aligned offset
+		 * blocks is the same as that for verbatim blocks.
+		 */
+		fallthrough;
+
+	case LZX_BLOCKTYPE_VERBATIM:
+
+		/* Read the main code and prepare its decode table.
+		 *
+		 * Note that the codeword lengths in the main code are encoded
+		 * in two parts: one part for literal symbols, and one part for
+		 * match symbols.
+		 */
+
+		if (lzx_read_codeword_lens(d, is, d->maincode_lens,
+					   LZX_NUM_CHARS))
+			return -1;
+
+		if (lzx_read_codeword_lens(d, is,
+					   d->maincode_lens + LZX_NUM_CHARS,
+					   LZX_MAINCODE_NUM_SYMBOLS - LZX_NUM_CHARS))
+			return -1;
+
+		if (make_huffman_decode_table(d->maincode_decode_table,
+					      LZX_MAINCODE_NUM_SYMBOLS,
+					      LZX_MAINCODE_TABLEBITS,
+					      d->maincode_lens,
+					      LZX_MAX_MAIN_CODEWORD_LEN,
+					      d->working_space))
+			return -1;
+
+		/* Read the length code and prepare its decode table.  */
+
+		if (lzx_read_codeword_lens(d, is, d->lencode_lens,
+					   LZX_LENCODE_NUM_SYMBOLS))
+			return -1;
+
+		if (make_huffman_decode_table(d->lencode_decode_table,
+					      LZX_LENCODE_NUM_SYMBOLS,
+					      LZX_LENCODE_TABLEBITS,
+					      d->lencode_lens,
+					      LZX_MAX_LEN_CODEWORD_LEN,
+					      d->working_space))
+			return -1;
+
+		break;
+
+	case LZX_BLOCKTYPE_UNCOMPRESSED:
+
+		/* Before reading the three recent offsets from the uncompressed
+		 * block header, the stream must be aligned on a 16-bit
+		 * boundary.  But if the stream is *already* aligned, then the
+		 * next 16 bits must be discarded.
+		 */
+		bitstream_ensure_bits(is, 1);
+		bitstream_align(is);
+
+		recent_offsets[0] = bitstream_read_u32(is);
+		recent_offsets[1] = bitstream_read_u32(is);
+		recent_offsets[2] = bitstream_read_u32(is);
+
+		/* Offsets of 0 are invalid.  */
+		if (recent_offsets[0] == 0 || recent_offsets[1] == 0 ||
+		    recent_offsets[2] == 0)
+			return -1;
+		break;
+
+	default:
+		/* Unrecognized block type.  */
+		return -1;
+	}
+
+	*block_type_ret = block_type;
+	*block_size_ret = block_size;
+	return 0;
+}
+
+/* Decompress a block of LZX-compressed data.  */
+static int lzx_decompress_block(const struct lzx_decompressor *d,
+				struct input_bitstream *is,
+				int block_type, u32 block_size,
+				u8 * const out_begin, u8 *out_next,
+				u32 recent_offsets[])
+{
+	u8 * const block_end = out_next + block_size;
+	u32 ones_if_aligned = 0U - (block_type == LZX_BLOCKTYPE_ALIGNED);
+
+	do {
+		u32 mainsym;
+		u32 match_len;
+		u32 match_offset;
+		u32 offset_slot;
+		u32 num_extra_bits;
+
+		mainsym = read_mainsym(d, is);
+		if (mainsym < LZX_NUM_CHARS) {
+			/* Literal  */
+			*out_next++ = mainsym;
+			continue;
+		}
+
+		/* Match  */
+
+		/* Decode the length header and offset slot.  */
+		mainsym -= LZX_NUM_CHARS;
+		match_len = mainsym % LZX_NUM_LEN_HEADERS;
+		offset_slot = mainsym / LZX_NUM_LEN_HEADERS;
+
+		/* If needed, read a length symbol to decode the full length. */
+		if (match_len == LZX_NUM_PRIMARY_LENS)
+			match_len += read_lensym(d, is);
+		match_len += LZX_MIN_MATCH_LEN;
+
+		if (offset_slot < LZX_NUM_RECENT_OFFSETS) {
+			/* Repeat offset  */
+
+			/* Note: This isn't a real LRU queue, since using the R2
+			 * offset doesn't bump the R1 offset down to R2.  This
+			 * quirk allows all 3 recent offsets to be handled by
+			 * the same code.  (For R0, the swap is a no-op.)
+			 */
+			match_offset = recent_offsets[offset_slot];
+			recent_offsets[offset_slot] = recent_offsets[0];
+			recent_offsets[0] = match_offset;
+		} else {
+			/* Explicit offset  */
+
+			/* Look up the number of extra bits that need to be read
+			 * to decode offsets with this offset slot.
+			 */
+			num_extra_bits = lzx_extra_offset_bits[offset_slot];
+
+			/* Start with the offset slot base value.  */
+			match_offset = lzx_offset_slot_base[offset_slot];
+
+			/* In aligned offset blocks, the low-order 3 bits of
+			 * each offset are encoded using the aligned offset
+			 * code.  Otherwise, all the extra bits are literal.
+			 */
+
+			if ((num_extra_bits & ones_if_aligned) >= LZX_NUM_ALIGNED_OFFSET_BITS) {
+				match_offset +=
+					bitstream_read_bits(is, num_extra_bits -
+								LZX_NUM_ALIGNED_OFFSET_BITS)
+							<< LZX_NUM_ALIGNED_OFFSET_BITS;
+				match_offset += read_alignedsym(d, is);
+			} else {
+				match_offset += bitstream_read_bits(is, num_extra_bits);
+			}
+
+			/* Adjust the offset.  */
+			match_offset -= (LZX_NUM_RECENT_OFFSETS - 1);
+
+			/* Update the recent offsets.  */
+			recent_offsets[2] = recent_offsets[1];
+			recent_offsets[1] = recent_offsets[0];
+			recent_offsets[0] = match_offset;
+		}
+
+		/* Validate the match, then copy it to the current position.  */
+
+		if (match_len > (size_t)(block_end - out_next))
+			return -1;
+
+		if (match_offset > (size_t)(out_next - out_begin))
+			return -1;
+
+		out_next = lz_copy(out_next, match_len, match_offset,
+				   block_end, LZX_MIN_MATCH_LEN);
+
+	} while (out_next != block_end);
+
+	return 0;
+}
+
+/*
+ * lzx_allocate_decompressor - Allocate an LZX decompressor
+ *
+ * Return the pointer to the decompressor on success, or return NULL and set
+ * errno on failure.
+ */
+struct lzx_decompressor *lzx_allocate_decompressor(void)
+{
+	return kmalloc(sizeof(struct lzx_decompressor), GFP_NOFS);
+}
+
+/*
+ * lzx_decompress - Decompress a buffer of LZX-compressed data
+ *
+ * @decompressor:      A decompressor allocated with lzx_allocate_decompressor()
+ * @compressed_data:	The buffer of data to decompress
+ * @compressed_size:	Number of bytes of compressed data
+ * @uncompressed_data:	The buffer in which to store the decompressed data
+ * @uncompressed_size:	The number of bytes the data decompresses into
+ *
+ * Return 0 on success, or return -1 and set errno on failure.
+ */
+int lzx_decompress(struct lzx_decompressor *decompressor,
+		   const void *compressed_data, size_t compressed_size,
+		   void *uncompressed_data, size_t uncompressed_size)
+{
+	struct lzx_decompressor *d = decompressor;
+	u8 * const out_begin = uncompressed_data;
+	u8 *out_next = out_begin;
+	u8 * const out_end = out_begin + uncompressed_size;
+	struct input_bitstream is;
+	u32 recent_offsets[LZX_NUM_RECENT_OFFSETS] = {1, 1, 1};
+	int e8_status = 0;
+
+	init_input_bitstream(&is, compressed_data, compressed_size);
+
+	/* Codeword lengths begin as all 0's for delta encoding purposes.  */
+	memset(d->maincode_lens, 0, LZX_MAINCODE_NUM_SYMBOLS);
+	memset(d->lencode_lens, 0, LZX_LENCODE_NUM_SYMBOLS);
+
+	/* Decompress blocks until we have all the uncompressed data.  */
+
+	while (out_next != out_end) {
+		int block_type;
+		u32 block_size;
+
+		if (lzx_read_block_header(d, &is, &block_type, &block_size,
+					  recent_offsets))
+			goto invalid;
+
+		if (block_size < 1 || block_size > (size_t)(out_end - out_next))
+			goto invalid;
+
+		if (block_type != LZX_BLOCKTYPE_UNCOMPRESSED) {
+
+			/* Compressed block  */
+
+			if (lzx_decompress_block(d,
+						 &is,
+						 block_type,
+						 block_size,
+						 out_begin,
+						 out_next,
+						 recent_offsets))
+				goto invalid;
+
+			e8_status |= d->maincode_lens[0xe8];
+			out_next += block_size;
+		} else {
+			/* Uncompressed block  */
+
+			out_next = bitstream_read_bytes(&is, out_next,
+							block_size);
+			if (!out_next)
+				goto invalid;
+
+			if (block_size & 1)
+				bitstream_read_byte(&is);
+
+			e8_status = 1;
+		}
+	}
+
+	/* Postprocess the data unless it cannot possibly contain 0xe8 bytes. */
+	if (e8_status)
+		lzx_postprocess(uncompressed_data, uncompressed_size);
+
+	return 0;
+
+invalid:
+	return -1;
+}
+
+/*
+ * lzx_free_decompressor - Free an LZX decompressor
+ *
+ * @decompressor:       A decompressor that was allocated with
+ *			lzx_allocate_decompressor(), or NULL.
+ */
+void lzx_free_decompressor(struct lzx_decompressor *decompressor)
+{
+	kfree(decompressor);
+}
diff --git a/fs/ntfs3/lib/xpress_decompress.c b/fs/ntfs3/lib/xpress_decompress.c
new file mode 100644
index 000000000000..3d98f36a981e
--- /dev/null
+++ b/fs/ntfs3/lib/xpress_decompress.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * xpress_decompress.c - A decompressor for the XPRESS compression format
+ * (Huffman variant), which can be used in "System Compressed" files.  This is
+ * based on the code from wimlib.
+ *
+ * Copyright (C) 2015 Eric Biggers
+ *
+ * This program is free software: you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License as published by the Free Software
+ * Foundation, either version 2 of the License, or (at your option) any later
+ * version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "decompress_common.h"
+#include "lib.h"
+
+#define XPRESS_NUM_SYMBOLS	512
+#define XPRESS_MAX_CODEWORD_LEN	15
+#define XPRESS_MIN_MATCH_LEN	3
+
+/* This value is chosen for fast decompression.  */
+#define XPRESS_TABLEBITS 12
+
+/* Reusable heap-allocated memory for XPRESS decompression  */
+struct xpress_decompressor {
+
+	/* The Huffman decoding table  */
+	u16 decode_table[(1 << XPRESS_TABLEBITS) + 2 * XPRESS_NUM_SYMBOLS];
+
+	/* An array that maps symbols to codeword lengths  */
+	u8 lens[XPRESS_NUM_SYMBOLS];
+
+	/* Temporary space for make_huffman_decode_table()  */
+	u16 working_space[2 * (1 + XPRESS_MAX_CODEWORD_LEN) +
+			  XPRESS_NUM_SYMBOLS];
+};
+
+/*
+ * xpress_allocate_decompressor - Allocate an XPRESS decompressor
+ *
+ * Return the pointer to the decompressor on success, or return NULL and set
+ * errno on failure.
+ */
+struct xpress_decompressor *xpress_allocate_decompressor(void)
+{
+	return kmalloc(sizeof(struct xpress_decompressor), GFP_NOFS);
+}
+
+/*
+ * xpress_decompress - Decompress a buffer of XPRESS-compressed data
+ *
+ * @decompressor:       A decompressor that was allocated with
+ *			xpress_allocate_decompressor()
+ * @compressed_data:	The buffer of data to decompress
+ * @compressed_size:	Number of bytes of compressed data
+ * @uncompressed_data:	The buffer in which to store the decompressed data
+ * @uncompressed_size:	The number of bytes the data decompresses into
+ *
+ * Return 0 on success, or return -1 and set errno on failure.
+ */
+int xpress_decompress(struct xpress_decompressor *decompressor,
+		      const void *compressed_data, size_t compressed_size,
+		      void *uncompressed_data, size_t uncompressed_size)
+{
+	struct xpress_decompressor *d = decompressor;
+	const u8 * const in_begin = compressed_data;
+	u8 * const out_begin = uncompressed_data;
+	u8 *out_next = out_begin;
+	u8 * const out_end = out_begin + uncompressed_size;
+	struct input_bitstream is;
+	u32 i;
+
+	/* Read the Huffman codeword lengths.  */
+	if (compressed_size < XPRESS_NUM_SYMBOLS / 2)
+		goto invalid;
+	for (i = 0; i < XPRESS_NUM_SYMBOLS / 2; i++) {
+		d->lens[i*2 + 0] = in_begin[i] & 0xF;
+		d->lens[i*2 + 1] = in_begin[i] >> 4;
+	}
+
+	/* Build a decoding table for the Huffman code.  */
+	if (make_huffman_decode_table(d->decode_table, XPRESS_NUM_SYMBOLS,
+				      XPRESS_TABLEBITS, d->lens,
+				      XPRESS_MAX_CODEWORD_LEN,
+				      d->working_space))
+		goto invalid;
+
+	/* Decode the matches and literals.  */
+
+	init_input_bitstream(&is, in_begin + XPRESS_NUM_SYMBOLS / 2,
+			     compressed_size - XPRESS_NUM_SYMBOLS / 2);
+
+	while (out_next != out_end) {
+		u32 sym;
+		u32 log2_offset;
+		u32 length;
+		u32 offset;
+
+		sym = read_huffsym(&is, d->decode_table,
+				   XPRESS_TABLEBITS, XPRESS_MAX_CODEWORD_LEN);
+		if (sym < 256) {
+			/* Literal  */
+			*out_next++ = sym;
+		} else {
+			/* Match  */
+			length = sym & 0xf;
+			log2_offset = (sym >> 4) & 0xf;
+
+			bitstream_ensure_bits(&is, 16);
+
+			offset = ((u32)1 << log2_offset) |
+				 bitstream_pop_bits(&is, log2_offset);
+
+			if (length == 0xf) {
+				length += bitstream_read_byte(&is);
+				if (length == 0xf + 0xff)
+					length = bitstream_read_u16(&is);
+			}
+			length += XPRESS_MIN_MATCH_LEN;
+
+			if (offset > (size_t)(out_next - out_begin))
+				goto invalid;
+
+			if (length > (size_t)(out_end - out_next))
+				goto invalid;
+
+			out_next = lz_copy(out_next, length, offset, out_end,
+					   XPRESS_MIN_MATCH_LEN);
+		}
+	}
+	return 0;
+
+invalid:
+	return -1;
+}
+
+/*
+ * xpress_free_decompressor - Free an XPRESS decompressor
+ *
+ * @decompressor:       A decompressor that was allocated with
+ *			xpress_allocate_decompressor(), or NULL.
+ */
+void xpress_free_decompressor(struct xpress_decompressor *decompressor)
+{
+	kfree(decompressor);
+}
diff --git a/fs/ntfs3/lznt.c b/fs/ntfs3/lznt.c
new file mode 100644
index 000000000000..73dc108c3204
--- /dev/null
+++ b/fs/ntfs3/lznt.c
@@ -0,0 +1,452 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/nls.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+// clang-format off
+/* src buffer is zero */
+#define LZNT_ERROR_ALL_ZEROS	1
+#define LZNT_CHUNK_SIZE		0x1000
+// clang-format on
+
+struct lznt_hash {
+	const u8 *p1;
+	const u8 *p2;
+};
+
+struct lznt {
+	const u8 *unc;
+	const u8 *unc_end;
+	const u8 *best_match;
+	size_t max_len;
+	bool std;
+
+	struct lznt_hash hash[LZNT_CHUNK_SIZE];
+};
+
+static inline size_t get_match_len(const u8 *ptr, const u8 *end, const u8 *prev,
+				   size_t max_len)
+{
+	size_t len = 0;
+
+	while (ptr + len < end && ptr[len] == prev[len] && ++len < max_len)
+		;
+	return len;
+}
+
+static size_t longest_match_std(const u8 *src, struct lznt *ctx)
+{
+	size_t hash_index;
+	size_t len1 = 0, len2 = 0;
+	const u8 **hash;
+
+	hash_index =
+		((40543U * ((((src[0] << 4) ^ src[1]) << 4) ^ src[2])) >> 4) &
+		(LZNT_CHUNK_SIZE - 1);
+
+	hash = &(ctx->hash[hash_index].p1);
+
+	if (hash[0] >= ctx->unc && hash[0] < src && hash[0][0] == src[0] &&
+	    hash[0][1] == src[1] && hash[0][2] == src[2]) {
+		len1 = 3;
+		if (ctx->max_len > 3)
+			len1 += get_match_len(src + 3, ctx->unc_end,
+					      hash[0] + 3, ctx->max_len - 3);
+	}
+
+	if (hash[1] >= ctx->unc && hash[1] < src && hash[1][0] == src[0] &&
+	    hash[1][1] == src[1] && hash[1][2] == src[2]) {
+		len2 = 3;
+		if (ctx->max_len > 3)
+			len2 += get_match_len(src + 3, ctx->unc_end,
+					      hash[1] + 3, ctx->max_len - 3);
+	}
+
+	/* Compare two matches and select the best one */
+	if (len1 < len2) {
+		ctx->best_match = hash[1];
+		len1 = len2;
+	} else {
+		ctx->best_match = hash[0];
+	}
+
+	hash[1] = hash[0];
+	hash[0] = src;
+	return len1;
+}
+
+static size_t longest_match_best(const u8 *src, struct lznt *ctx)
+{
+	size_t max_len;
+	const u8 *ptr;
+
+	if (ctx->unc >= src || !ctx->max_len)
+		return 0;
+
+	max_len = 0;
+	for (ptr = ctx->unc; ptr < src; ++ptr) {
+		size_t len =
+			get_match_len(src, ctx->unc_end, ptr, ctx->max_len);
+		if (len >= max_len) {
+			max_len = len;
+			ctx->best_match = ptr;
+		}
+	}
+
+	return max_len >= 3 ? max_len : 0;
+}
+
+static const size_t s_max_len[] = {
+	0x1002, 0x802, 0x402, 0x202, 0x102, 0x82, 0x42, 0x22, 0x12,
+};
+
+static const size_t s_max_off[] = {
+	0x10, 0x20, 0x40, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000,
+};
+
+static inline u16 make_pair(size_t offset, size_t len, size_t index)
+{
+	return ((offset - 1) << (12 - index)) |
+	       ((len - 3) & (((1 << (12 - index)) - 1)));
+}
+
+static inline size_t parse_pair(u16 pair, size_t *offset, size_t index)
+{
+	*offset = 1 + (pair >> (12 - index));
+	return 3 + (pair & ((1 << (12 - index)) - 1));
+}
+
+/*
+ * compress_chunk
+ *
+ * returns one of the three values:
+ * 0 - ok, 'cmpr' contains 'cmpr_chunk_size' bytes of compressed data
+ * 1 - input buffer is full zero
+ * -2 - the compressed buffer is too small to hold the compressed data
+ */
+static inline int compress_chunk(size_t (*match)(const u8 *, struct lznt *),
+				 const u8 *unc, const u8 *unc_end, u8 *cmpr,
+				 u8 *cmpr_end, size_t *cmpr_chunk_size,
+				 struct lznt *ctx)
+{
+	size_t cnt = 0;
+	size_t idx = 0;
+	const u8 *up = unc;
+	u8 *cp = cmpr + 3;
+	u8 *cp2 = cmpr + 2;
+	u8 not_zero = 0;
+	/* Control byte of 8-bit values: ( 0 - means byte as is, 1 - short pair ) */
+	u8 ohdr = 0;
+	u8 *last;
+	u16 t16;
+
+	if (unc + LZNT_CHUNK_SIZE < unc_end)
+		unc_end = unc + LZNT_CHUNK_SIZE;
+
+	last = min(cmpr + LZNT_CHUNK_SIZE + sizeof(short), cmpr_end);
+
+	ctx->unc = unc;
+	ctx->unc_end = unc_end;
+	ctx->max_len = s_max_len[0];
+
+	while (up < unc_end) {
+		size_t max_len;
+
+		while (unc + s_max_off[idx] < up)
+			ctx->max_len = s_max_len[++idx];
+
+		// Find match
+		max_len = up + 3 <= unc_end ? (*match)(up, ctx) : 0;
+
+		if (!max_len) {
+			if (cp >= last)
+				goto NotCompressed;
+			not_zero |= *cp++ = *up++;
+		} else if (cp + 1 >= last) {
+			goto NotCompressed;
+		} else {
+			t16 = make_pair(up - ctx->best_match, max_len, idx);
+			*cp++ = t16;
+			*cp++ = t16 >> 8;
+
+			ohdr |= 1 << cnt;
+			up += max_len;
+		}
+
+		cnt = (cnt + 1) & 7;
+		if (!cnt) {
+			*cp2 = ohdr;
+			ohdr = 0;
+			cp2 = cp;
+			cp += 1;
+		}
+	}
+
+	if (cp2 < last)
+		*cp2 = ohdr;
+	else
+		cp -= 1;
+
+	*cmpr_chunk_size = cp - cmpr;
+
+	t16 = (*cmpr_chunk_size - 3) | 0xB000;
+	cmpr[0] = t16;
+	cmpr[1] = t16 >> 8;
+
+	return not_zero ? 0 : LZNT_ERROR_ALL_ZEROS;
+
+NotCompressed:
+
+	if ((cmpr + LZNT_CHUNK_SIZE + sizeof(short)) > last)
+		return -2;
+
+	/*
+	 * Copy non cmpr data
+	 * 0x3FFF == ((LZNT_CHUNK_SIZE + 2 - 3) | 0x3000)
+	 */
+	cmpr[0] = 0xff;
+	cmpr[1] = 0x3f;
+
+	memcpy(cmpr + sizeof(short), unc, LZNT_CHUNK_SIZE);
+	*cmpr_chunk_size = LZNT_CHUNK_SIZE + sizeof(short);
+
+	return 0;
+}
+
+static inline ssize_t decompress_chunk(u8 *unc, u8 *unc_end, const u8 *cmpr,
+				       const u8 *cmpr_end)
+{
+	u8 *up = unc;
+	u8 ch = *cmpr++;
+	size_t bit = 0;
+	size_t index = 0;
+	u16 pair;
+	size_t offset, length;
+
+	/* Do decompression until pointers are inside range */
+	while (up < unc_end && cmpr < cmpr_end) {
+		/* Correct index */
+		while (unc + s_max_off[index] < up)
+			index += 1;
+
+		/* Check the current flag for zero */
+		if (!(ch & (1 << bit))) {
+			/* Just copy byte */
+			*up++ = *cmpr++;
+			goto next;
+		}
+
+		/* Check for boundary */
+		if (cmpr + 1 >= cmpr_end)
+			return -EINVAL;
+
+		/* Read a short from little endian stream */
+		pair = cmpr[1];
+		pair <<= 8;
+		pair |= cmpr[0];
+
+		cmpr += 2;
+
+		/* Translate packed information into offset and length */
+		length = parse_pair(pair, &offset, index);
+
+		/* Check offset for boundary */
+		if (unc + offset > up)
+			return -EINVAL;
+
+		/* Truncate the length if necessary */
+		if (up + length >= unc_end)
+			length = unc_end - up;
+
+		/* Now we copy bytes. This is the heart of LZ algorithm. */
+		for (; length > 0; length--, up++)
+			*up = *(up - offset);
+
+next:
+		/* Advance flag bit value */
+		bit = (bit + 1) & 7;
+
+		if (!bit) {
+			if (cmpr >= cmpr_end)
+				break;
+
+			ch = *cmpr++;
+		}
+	}
+
+	/* return the size of uncompressed data */
+	return up - unc;
+}
+
+/*
+ * 0 - standard compression
+ * !0 - best compression, requires a lot of cpu
+ */
+struct lznt *get_lznt_ctx(int level)
+{
+	struct lznt *r = ntfs_zalloc(level ? offsetof(struct lznt, hash) :
+					     sizeof(struct lznt));
+
+	if (r)
+		r->std = !level;
+	return r;
+}
+
+/*
+ * compress_lznt
+ *
+ * Compresses "unc" into "cmpr"
+ * +x - ok, 'cmpr' contains 'final_compressed_size' bytes of compressed data
+ * 0 - input buffer is full zero
+ */
+size_t compress_lznt(const void *unc, size_t unc_size, void *cmpr,
+		     size_t cmpr_size, struct lznt *ctx)
+{
+	int err;
+	size_t (*match)(const u8 *src, struct lznt *ctx);
+	u8 *p = cmpr;
+	u8 *end = p + cmpr_size;
+	const u8 *unc_chunk = unc;
+	const u8 *unc_end = unc_chunk + unc_size;
+	bool is_zero = true;
+
+	if (ctx->std) {
+		match = &longest_match_std;
+		memset(ctx->hash, 0, sizeof(ctx->hash));
+	} else {
+		match = &longest_match_best;
+	}
+
+	/* compression cycle */
+	for (; unc_chunk < unc_end; unc_chunk += LZNT_CHUNK_SIZE) {
+		cmpr_size = 0;
+		err = compress_chunk(match, unc_chunk, unc_end, p, end,
+				     &cmpr_size, ctx);
+		if (err < 0)
+			return unc_size;
+
+		if (is_zero && err != LZNT_ERROR_ALL_ZEROS)
+			is_zero = false;
+
+		p += cmpr_size;
+	}
+
+	if (p <= end - 2)
+		p[0] = p[1] = 0;
+
+	return is_zero ? 0 : PtrOffset(cmpr, p);
+}
+
+/*
+ * decompress_lznt
+ *
+ * decompresses "cmpr" into "unc"
+ */
+ssize_t decompress_lznt(const void *cmpr, size_t cmpr_size, void *unc,
+			size_t unc_size)
+{
+	const u8 *cmpr_chunk = cmpr;
+	const u8 *cmpr_end = cmpr_chunk + cmpr_size;
+	u8 *unc_chunk = unc;
+	u8 *unc_end = unc_chunk + unc_size;
+	u16 chunk_hdr;
+
+	if (cmpr_size < sizeof(short))
+		return -EINVAL;
+
+	/* read chunk header */
+	chunk_hdr = cmpr_chunk[1];
+	chunk_hdr <<= 8;
+	chunk_hdr |= cmpr_chunk[0];
+
+	/* loop through decompressing chunks */
+	for (;;) {
+		size_t chunk_size_saved;
+		size_t unc_use;
+		size_t cmpr_use = 3 + (chunk_hdr & (LZNT_CHUNK_SIZE - 1));
+
+		/* Check that the chunk actually fits the supplied buffer */
+		if (cmpr_chunk + cmpr_use > cmpr_end)
+			return -EINVAL;
+
+		/* First make sure the chunk contains compressed data */
+		if (chunk_hdr & 0x8000) {
+			/* Decompress a chunk and return if we get an error */
+			ssize_t err =
+				decompress_chunk(unc_chunk, unc_end,
+						 cmpr_chunk + sizeof(chunk_hdr),
+						 cmpr_chunk + cmpr_use);
+			if (err < 0)
+				return err;
+			unc_use = err;
+		} else {
+			/* This chunk does not contain compressed data */
+			unc_use = unc_chunk + LZNT_CHUNK_SIZE > unc_end ?
+					  unc_end - unc_chunk :
+					  LZNT_CHUNK_SIZE;
+
+			if (cmpr_chunk + sizeof(chunk_hdr) + unc_use >
+			    cmpr_end) {
+				return -EINVAL;
+			}
+
+			memcpy(unc_chunk, cmpr_chunk + sizeof(chunk_hdr),
+			       unc_use);
+		}
+
+		/* Advance pointers */
+		cmpr_chunk += cmpr_use;
+		unc_chunk += unc_use;
+
+		/* Check for the end of unc buffer */
+		if (unc_chunk >= unc_end)
+			break;
+
+		/* Proceed the next chunk */
+		if (cmpr_chunk > cmpr_end - 2)
+			break;
+
+		chunk_size_saved = LZNT_CHUNK_SIZE;
+
+		/* read chunk header */
+		chunk_hdr = cmpr_chunk[1];
+		chunk_hdr <<= 8;
+		chunk_hdr |= cmpr_chunk[0];
+
+		if (!chunk_hdr)
+			break;
+
+		/* Check the size of unc buffer */
+		if (unc_use < chunk_size_saved) {
+			size_t t1 = chunk_size_saved - unc_use;
+			u8 *t2 = unc_chunk + t1;
+
+			/* 'Zero' memory */
+			if (t2 >= unc_end)
+				break;
+
+			memset(unc_chunk, 0, t1);
+			unc_chunk = t2;
+		}
+	}
+
+	/* Check compression boundary */
+	if (cmpr_chunk > cmpr_end)
+		return -EINVAL;
+
+	/*
+	 * The unc size is just a difference between current
+	 * pointer and original one
+	 */
+	return PtrOffset(unc, unc_chunk);
+}

From patchwork Thu Jan 28 09:04:52 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053223
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 7D31BC433DB
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:09:40 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 1887C64DE7
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:09:40 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S232229AbhA1JIy (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:08:54 -0500
Received: from relaydlg-01.paragon-software.com ([81.5.88.159]:37282 "EHLO
        relaydlg-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231408AbhA1JGe (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:06:34 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relaydlg-01.paragon-software.com (Postfix) with ESMTPS id
 A858D82303;
        Thu, 28 Jan 2021 12:05:04 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824704;
        bh=UKz33/EL49jdc1VbGOU9DKSeN9j5aXnXWkMVyjPw5Ks=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=GHzD59SogtQVxzVI4tqAT5BW+JA2lKCso6uzsI7Iuo30zk7wwrjNfTRZpJdbBtV/4
         s//gzFuF5u/bse3CLf1Qp7SYUtLL1CjenCNFK4K05U9IrgvOj9LxGNAHWHfd884pc4
         wyBVksdCOeqeuiG4cG2FYEIGPiCZmLDa4/CaC6PY=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:03 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 07/10] fs/ntfs3: Add NTFS journal
Date: Thu, 28 Jan 2021 12:04:52 +0300
Message-ID: 
 <20210128090455.3576502-8-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds NTFS journal

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 fs/ntfs3/fslog.c | 5204 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 5204 insertions(+)
 create mode 100644 fs/ntfs3/fslog.c

diff --git a/fs/ntfs3/fslog.c b/fs/ntfs3/fslog.c
new file mode 100644
index 000000000000..1aeaa25441a2
--- /dev/null
+++ b/fs/ntfs3/fslog.c
@@ -0,0 +1,5204 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright (C) 2019-2020 Paragon Software GmbH, All rights reserved.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/buffer_head.h>
+#include <linux/fs.h>
+#include <linux/hash.h>
+#include <linux/nls.h>
+#include <linux/random.h>
+#include <linux/ratelimit.h>
+#include <linux/slab.h>
+
+#include "debug.h"
+#include "ntfs.h"
+#include "ntfs_fs.h"
+
+/*
+ * LOG FILE structs
+ */
+
+// clang-format off
+
+#define MaxLogFileSize     0x100000000ull
+#define DefaultLogPageSize 4096
+#define MinLogRecordPages  0x30
+
+struct RESTART_HDR {
+	struct NTFS_RECORD_HEADER rhdr; // 'RSTR'
+	__le32 sys_page_size; // 0x10: Page size of the system which initialized the log
+	__le32 page_size;     // 0x14: Log page size used for this log file
+	__le16 ra_off;        // 0x18:
+	__le16 minor_ver;     // 0x1A:
+	__le16 major_ver;     // 0x1C:
+	__le16 fixups[1];
+};
+
+#define LFS_NO_CLIENT 0xffff
+#define LFS_NO_CLIENT_LE cpu_to_le16(0xffff)
+
+struct CLIENT_REC {
+	__le64 oldest_lsn;
+	__le64 restart_lsn; // 0x08:
+	__le16 prev_client; // 0x10:
+	__le16 next_client; // 0x12:
+	__le16 seq_num;     // 0x14:
+	u8 align[6];        // 0x16
+	__le32 name_bytes;  // 0x1C: in bytes
+	__le16 name[32];    // 0x20: name of client
+};
+
+static_assert(sizeof(struct CLIENT_REC) == 0x60);
+
+/* Two copies of these will exist at the beginning of the log file */
+struct RESTART_AREA {
+	__le64 current_lsn;    // 0x00: Current logical end of log file
+	__le16 log_clients;    // 0x08: Maximum number of clients
+	__le16 client_idx[2];  // 0x0A: free/use index into the client record arrays
+	__le16 flags;          // 0x0E: See RESTART_SINGLE_PAGE_IO
+	__le32 seq_num_bits;   // 0x10: the number of bits in sequence number.
+	__le16 ra_len;         // 0x14:
+	__le16 client_off;     // 0x16:
+	__le64 l_size;         // 0x18: Usable log file size.
+	__le32 last_lsn_data_len; // 0x20:
+	__le16 rec_hdr_len;    // 0x24: log page data offset
+	__le16 data_off;       // 0x26: log page data length
+	__le32 open_log_count; // 0x28:
+	__le32 align[5];       // 0x2C:
+	struct CLIENT_REC clients[1]; // 0x40:
+};
+
+struct LOG_REC_HDR {
+	__le16 redo_op;      // 0x00:  NTFS_LOG_OPERATION
+	__le16 undo_op;      // 0x02:  NTFS_LOG_OPERATION
+	__le16 redo_off;     // 0x04:  Offset to Redo record
+	__le16 redo_len;     // 0x06:  Redo length
+	__le16 undo_off;     // 0x08:  Offset to Undo record
+	__le16 undo_len;     // 0x0A:  Undo length
+	__le16 target_attr;  // 0x0C:
+	__le16 lcns_follow;  // 0x0E:
+	__le16 record_off;   // 0x10:
+	__le16 attr_off;     // 0x12:
+	__le16 cluster_off;  // 0x14:
+	__le16 reserved;     // 0x16:
+	__le64 target_vcn;   // 0x18:
+	__le64 page_lcns[1]; // 0x20:
+};
+
+static_assert(sizeof(struct LOG_REC_HDR) == 0x28);
+
+#define RESTART_ENTRY_ALLOCATED    0xFFFFFFFF
+#define RESTART_ENTRY_ALLOCATED_LE cpu_to_le32(0xFFFFFFFF)
+
+struct RESTART_TABLE {
+	__le16 size;       // 0x00:  In bytes
+	__le16 used;       // 0x02: entries
+	__le16 total;      // 0x04: entries
+	__le16 res[3];     // 0x06:
+	__le32 free_goal;  // 0x0C:
+	__le32 first_free; // 0x10
+	__le32 last_free;  // 0x14
+
+};
+
+static_assert(sizeof(struct RESTART_TABLE) == 0x18);
+
+struct ATTR_NAME_ENTRY {
+	__le16 off; // offset in the Open attribute Table
+	__le16 name_bytes;
+	__le16 name[1];
+};
+
+struct OPEN_ATTR_ENRTY {
+	__le32 next;            // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+	__le32 bytes_per_index; // 0x04:
+	enum ATTR_TYPE type;    // 0x08:
+	u8 is_dirty_pages;      // 0x0C:
+	u8 is_attr_name;        // 0x0B: Faked field to manage 'ptr'
+	u8 name_len;            // 0x0C: Faked field to manage 'ptr'
+	u8 res;
+	struct MFT_REF ref; // 0x10: File Reference of file containing attribute
+	__le64 open_record_lsn; // 0x18:
+	void *ptr;              // 0x20:
+};
+
+/* 32 bit version of 'struct OPEN_ATTR_ENRTY' */
+struct OPEN_ATTR_ENRTY_32 {
+	__le32 next;            // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+	__le32 ptr;             // 0x04:
+	struct MFT_REF ref;     // 0x08:
+	__le64 open_record_lsn; // 0x10:
+	u8 is_dirty_pages;      // 0x18:
+	u8 is_attr_name;        // 0x19
+	u8 res1[2];
+	enum ATTR_TYPE type;    // 0x1C:
+	u8 name_len;            // 0x20:  in wchar
+	u8 res2[3];
+	__le32 AttributeName;   // 0x24:
+	__le32 bytes_per_index; // 0x28:
+};
+
+#define SIZEOF_OPENATTRIBUTEENTRY0 0x2c
+// static_assert( 0x2C == sizeof(struct OPEN_ATTR_ENRTY_32) );
+static_assert(sizeof(struct OPEN_ATTR_ENRTY) < SIZEOF_OPENATTRIBUTEENTRY0);
+
+/*
+ * One entry exists in the Dirty Pages Table for each page which is dirty at the
+ * time the Restart Area is written
+ */
+struct DIR_PAGE_ENTRY {
+	__le32 next;         // 0x00:  RESTART_ENTRY_ALLOCATED if allocated
+	__le32 target_attr;  // 0x04:  Index into the Open attribute Table
+	__le32 transfer_len; // 0x08:
+	__le32 lcns_follow;  // 0x0C:
+	__le64 vcn;          // 0x10:  Vcn of dirty page
+	__le64 oldest_lsn;   // 0x18:
+	__le64 page_lcns[1]; // 0x20:
+};
+
+static_assert(sizeof(struct DIR_PAGE_ENTRY) == 0x28);
+
+/* 32 bit version of 'struct DIR_PAGE_ENTRY' */
+struct DIR_PAGE_ENTRY_32 {
+	__le32 next;         // 0x00:  RESTART_ENTRY_ALLOCATED if allocated
+	__le32 target_attr;  // 0x04:  Index into the Open attribute Table
+	__le32 transfer_len; // 0x08:
+	__le32 lcns_follow;  // 0x0C:
+	__le32 reserved;     // 0x10:
+	__le32 vcn_low;      // 0x14:  Vcn of dirty page
+	__le32 vcn_hi;       // 0x18:  Vcn of dirty page
+	__le32 oldest_lsn_low; // 0x1C:
+	__le32 oldest_lsn_hi; // 0x1C:
+	__le32 page_lcns_low; // 0x24:
+	__le32 page_lcns_hi; // 0x24:
+};
+
+static_assert(offsetof(struct DIR_PAGE_ENTRY_32, vcn_low) == 0x14);
+static_assert(sizeof(struct DIR_PAGE_ENTRY_32) == 0x2c);
+
+enum transact_state {
+	TransactionUninitialized = 0,
+	TransactionActive,
+	TransactionPrepared,
+	TransactionCommitted
+};
+
+struct TRANSACTION_ENTRY {
+	__le32 next;          // 0x00: RESTART_ENTRY_ALLOCATED if allocated
+	u8 transact_state;    // 0x04:
+	u8 reserved[3];       // 0x05:
+	__le64 first_lsn;     // 0x08:
+	__le64 prev_lsn;      // 0x10:
+	__le64 undo_next_lsn; // 0x18:
+	__le32 undo_records;  // 0x20: Number of undo log records pending abort
+	__le32 undo_len;      // 0x24: Total undo size
+};
+
+static_assert(sizeof(struct TRANSACTION_ENTRY) == 0x28);
+
+struct NTFS_RESTART {
+	__le32 major_ver;             // 0x00:
+	__le32 minor_ver;             // 0x04:
+	__le64 check_point_start;     // 0x08:
+	__le64 open_attr_table_lsn;   // 0x10:
+	__le64 attr_names_lsn;        // 0x18:
+	__le64 dirty_pages_table_lsn; // 0x20:
+	__le64 transact_table_lsn;    // 0x28:
+	__le32 open_attr_len;         // 0x30: In bytes
+	__le32 attr_names_len;        // 0x34: In bytes
+	__le32 dirty_pages_len;       // 0x38: In bytes
+	__le32 transact_table_len;    // 0x3C: In bytes
+};
+
+static_assert(sizeof(struct NTFS_RESTART) == 0x40);
+
+struct NEW_ATTRIBUTE_SIZES {
+	__le64 alloc_size;
+	__le64 valid_size;
+	__le64 data_size;
+	__le64 total_size;
+};
+
+struct BITMAP_RANGE {
+	__le32 bitmap_off;
+	__le32 bits;
+};
+
+struct LCN_RANGE {
+	__le64 lcn;
+	__le64 len;
+};
+
+/* The following type defines the different log record types */
+#define LfsClientRecord  cpu_to_le32(1)
+#define LfsClientRestart cpu_to_le32(2)
+
+/* This is used to uniquely identify a client for a particular log file */
+struct CLIENT_ID {
+	__le16 seq_num;
+	__le16 client_idx;
+};
+
+/* This is the header that begins every Log Record in the log file */
+struct LFS_RECORD_HDR {
+	__le64 this_lsn;    // 0x00:
+	__le64 client_prev_lsn;  // 0x08:
+	__le64 client_undo_next_lsn; // 0x10:
+	__le32 client_data_len;  // 0x18:
+	struct CLIENT_ID client; // 0x1C: Owner of this log record
+	__le32 record_type; // 0x20: LfsClientRecord or LfsClientRestart
+	__le32 transact_id; // 0x24:
+	__le16 flags;       // 0x28:	LOG_RECORD_MULTI_PAGE
+	u8 align[6];        // 0x2A:
+};
+
+#define LOG_RECORD_MULTI_PAGE cpu_to_le16(1)
+
+static_assert(sizeof(struct LFS_RECORD_HDR) == 0x30);
+
+struct LFS_RECORD {
+	__le16 next_record_off; // 0x00: Offset of the free space in the page
+	u8 align[6];         // 0x02:
+	__le64 last_end_lsn; // 0x08: lsn for the last log record which ends on the page
+};
+
+static_assert(sizeof(struct LFS_RECORD) == 0x10);
+
+struct RECORD_PAGE_HDR {
+	struct NTFS_RECORD_HEADER rhdr; // 'RCRD'
+	__le32 rflags;     // 0x10:  See LOG_PAGE_LOG_RECORD_END
+	__le16 page_count; // 0x14:
+	__le16 page_pos;   // 0x16:
+	struct LFS_RECORD record_hdr; // 0x18
+	__le16 fixups[10]; // 0x28
+	__le32 file_off;   // 0x3c: used when major version >= 2
+};
+
+// clang-format on
+
+// Page contains the end of a log record
+#define LOG_PAGE_LOG_RECORD_END cpu_to_le32(0x00000001)
+
+static inline bool is_log_record_end(const struct RECORD_PAGE_HDR *hdr)
+{
+	return hdr->rflags & LOG_PAGE_LOG_RECORD_END;
+}
+
+static_assert(offsetof(struct RECORD_PAGE_HDR, file_off) == 0x3c);
+
+/*
+ * END of NTFS LOG structures
+ */
+
+/* Define some tuning parameters to keep the restart tables a reasonable size */
+#define INITIAL_NUMBER_TRANSACTIONS 5
+
+enum NTFS_LOG_OPERATION {
+
+	Noop = 0x00,
+	CompensationLogRecord = 0x01,
+	InitializeFileRecordSegment = 0x02,
+	DeallocateFileRecordSegment = 0x03,
+	WriteEndOfFileRecordSegment = 0x04,
+	CreateAttribute = 0x05,
+	DeleteAttribute = 0x06,
+	UpdateResidentValue = 0x07,
+	UpdateNonresidentValue = 0x08,
+	UpdateMappingPairs = 0x09,
+	DeleteDirtyClusters = 0x0A,
+	SetNewAttributeSizes = 0x0B,
+	AddIndexEntryRoot = 0x0C,
+	DeleteIndexEntryRoot = 0x0D,
+	AddIndexEntryAllocation = 0x0E,
+	DeleteIndexEntryAllocation = 0x0F,
+	WriteEndOfIndexBuffer = 0x10,
+	SetIndexEntryVcnRoot = 0x11,
+	SetIndexEntryVcnAllocation = 0x12,
+	UpdateFileNameRoot = 0x13,
+	UpdateFileNameAllocation = 0x14,
+	SetBitsInNonresidentBitMap = 0x15,
+	ClearBitsInNonresidentBitMap = 0x16,
+	HotFix = 0x17,
+	EndTopLevelAction = 0x18,
+	PrepareTransaction = 0x19,
+	CommitTransaction = 0x1A,
+	ForgetTransaction = 0x1B,
+	OpenNonresidentAttribute = 0x1C,
+	OpenAttributeTableDump = 0x1D,
+	AttributeNamesDump = 0x1E,
+	DirtyPageTableDump = 0x1F,
+	TransactionTableDump = 0x20,
+	UpdateRecordDataRoot = 0x21,
+	UpdateRecordDataAllocation = 0x22,
+
+	UpdateRelativeDataInIndex =
+		0x23, // NtOfsRestartUpdateRelativeDataInIndex
+	UpdateRelativeDataInIndex2 = 0x24,
+	ZeroEndOfFileRecord = 0x25,
+};
+
+/*
+ * Array for log records which require a target attribute
+ * A true indicates that the corresponding restart operation requires a target attribute
+ */
+static const u8 AttributeRequired[] = {
+	0xFC, 0xFB, 0xFF, 0x10, 0x06,
+};
+
+static inline bool is_target_required(u16 op)
+{
+	bool ret = op <= UpdateRecordDataAllocation &&
+		   (AttributeRequired[op >> 3] >> (op & 7) & 1);
+	return ret;
+}
+
+static inline bool can_skip_action(enum NTFS_LOG_OPERATION op)
+{
+	switch (op) {
+	case Noop:
+	case DeleteDirtyClusters:
+	case HotFix:
+	case EndTopLevelAction:
+	case PrepareTransaction:
+	case CommitTransaction:
+	case ForgetTransaction:
+	case CompensationLogRecord:
+	case OpenNonresidentAttribute:
+	case OpenAttributeTableDump:
+	case AttributeNamesDump:
+	case DirtyPageTableDump:
+	case TransactionTableDump:
+		return true;
+	default:
+		return false;
+	}
+}
+
+enum { lcb_ctx_undo_next, lcb_ctx_prev, lcb_ctx_next };
+
+/* bytes per restart table */
+static inline u32 bytes_per_rt(const struct RESTART_TABLE *rt)
+{
+	return le16_to_cpu(rt->used) * le16_to_cpu(rt->size) +
+	       sizeof(struct RESTART_TABLE);
+}
+
+/* log record length */
+static inline u32 lrh_length(const struct LOG_REC_HDR *lr)
+{
+	u16 t16 = le16_to_cpu(lr->lcns_follow);
+
+	return t16 > 1 ? sizeof(struct LOG_REC_HDR) + (t16 - 1) * sizeof(u64) :
+			 sizeof(struct LOG_REC_HDR);
+}
+
+struct lcb {
+	struct LFS_RECORD_HDR *lrh; // Log record header of the current lsn
+	struct LOG_REC_HDR *log_rec;
+	u32 ctx_mode; // lcb_ctx_undo_next/lcb_ctx_prev/lcb_ctx_next
+	struct CLIENT_ID client;
+	bool alloc; // if true the we should deallocate 'log_rec'
+};
+
+static void lcb_put(struct lcb *lcb)
+{
+	if (lcb->alloc)
+		ntfs_free(lcb->log_rec);
+	ntfs_free(lcb->lrh);
+	ntfs_free(lcb);
+}
+
+/*
+ * oldest_client_lsn
+ *
+ * find the oldest lsn from active clients.
+ */
+static inline void oldest_client_lsn(const struct CLIENT_REC *ca,
+				     __le16 next_client, u64 *oldest_lsn)
+{
+	while (next_client != LFS_NO_CLIENT_LE) {
+		const struct CLIENT_REC *cr = ca + le16_to_cpu(next_client);
+		u64 lsn = le64_to_cpu(cr->oldest_lsn);
+
+		/* ignore this block if it's oldest lsn is 0 */
+		if (lsn && lsn < *oldest_lsn)
+			*oldest_lsn = lsn;
+
+		next_client = cr->next_client;
+	}
+}
+
+static inline bool is_rst_page_hdr_valid(u32 file_off,
+					 const struct RESTART_HDR *rhdr)
+{
+	u32 sys_page = le32_to_cpu(rhdr->sys_page_size);
+	u32 page_size = le32_to_cpu(rhdr->page_size);
+	u32 end_usa;
+	u16 ro;
+
+	if (sys_page < SECTOR_SIZE || page_size < SECTOR_SIZE ||
+	    sys_page & (sys_page - 1) || page_size & (page_size - 1)) {
+		return false;
+	}
+
+	/* Check that if the file offset isn't 0, it is the system page size */
+	if (file_off && file_off != sys_page)
+		return false;
+
+	/* Check support version 1.1+ */
+	if (le16_to_cpu(rhdr->major_ver) <= 1 && !rhdr->minor_ver)
+		return false;
+
+	if (le16_to_cpu(rhdr->major_ver) > 2)
+		return false;
+
+	ro = le16_to_cpu(rhdr->ra_off);
+	if (!IsQuadAligned(ro) || ro > sys_page)
+		return false;
+
+	end_usa = ((sys_page >> SECTOR_SHIFT) + 1) * sizeof(short);
+	end_usa += le16_to_cpu(rhdr->rhdr.fix_off);
+
+	if (ro < end_usa)
+		return false;
+
+	return true;
+}
+
+static inline bool is_rst_area_valid(const struct RESTART_HDR *rhdr)
+{
+	const struct RESTART_AREA *ra;
+	u16 cl, fl, ul;
+	u32 off, l_size, file_dat_bits, file_size_round;
+	u16 ro = le16_to_cpu(rhdr->ra_off);
+	u32 sys_page = le32_to_cpu(rhdr->sys_page_size);
+
+	if (ro + offsetof(struct RESTART_AREA, l_size) >
+	    SECTOR_SIZE - sizeof(short))
+		return false;
+
+	ra = Add2Ptr(rhdr, ro);
+	cl = le16_to_cpu(ra->log_clients);
+
+	if (cl > 1)
+		return false;
+
+	off = le16_to_cpu(ra->client_off);
+
+	if (!IsQuadAligned(off) || ro + off > SECTOR_SIZE - sizeof(short))
+		return false;
+
+	off += cl * sizeof(struct CLIENT_REC);
+
+	if (off > sys_page)
+		return false;
+
+	/*
+	 * Check the restart length field and whether the entire
+	 * restart area is contained that length
+	 */
+	if (le16_to_cpu(rhdr->ra_off) + le16_to_cpu(ra->ra_len) > sys_page ||
+	    off > le16_to_cpu(ra->ra_len)) {
+		return false;
+	}
+
+	/*
+	 * As a final check make sure that the use list and the free list
+	 * are either empty or point to a valid client
+	 */
+	fl = le16_to_cpu(ra->client_idx[0]);
+	ul = le16_to_cpu(ra->client_idx[1]);
+	if ((fl != LFS_NO_CLIENT && fl >= cl) ||
+	    (ul != LFS_NO_CLIENT && ul >= cl))
+		return false;
+
+	/* Make sure the sequence number bits match the log file size */
+	l_size = le64_to_cpu(ra->l_size);
+
+	file_dat_bits = sizeof(u64) * 8 - le32_to_cpu(ra->seq_num_bits);
+	file_size_round = 1u << (file_dat_bits + 3);
+	if (file_size_round != l_size &&
+	    (file_size_round < l_size || (file_size_round / 2) > l_size)) {
+		return false;
+	}
+
+	/* The log page data offset and record header length must be quad-aligned */
+	if (!IsQuadAligned(le16_to_cpu(ra->data_off)) ||
+	    !IsQuadAligned(le16_to_cpu(ra->rec_hdr_len)))
+		return false;
+
+	return true;
+}
+
+static inline bool is_client_area_valid(const struct RESTART_HDR *rhdr,
+					bool usa_error)
+{
+	u16 ro = le16_to_cpu(rhdr->ra_off);
+	const struct RESTART_AREA *ra = Add2Ptr(rhdr, ro);
+	u16 ra_len = le16_to_cpu(ra->ra_len);
+	const struct CLIENT_REC *ca;
+	u32 i;
+
+	if (usa_error && ra_len + ro > SECTOR_SIZE - sizeof(short))
+		return false;
+
+	/* Find the start of the client array */
+	ca = Add2Ptr(ra, le16_to_cpu(ra->client_off));
+
+	/*
+	 * Start with the free list
+	 * Check that all the clients are valid and that there isn't a cycle
+	 * Do the in-use list on the second pass
+	 */
+	for (i = 0; i < 2; i++) {
+		u16 client_idx = le16_to_cpu(ra->client_idx[i]);
+		bool first_client = true;
+		u16 clients = le16_to_cpu(ra->log_clients);
+
+		while (client_idx != LFS_NO_CLIENT) {
+			const struct CLIENT_REC *cr;
+
+			if (!clients ||
+			    client_idx >= le16_to_cpu(ra->log_clients))
+				return false;
+
+			clients -= 1;
+			cr = ca + client_idx;
+
+			client_idx = le16_to_cpu(cr->next_client);
+
+			if (first_client) {
+				first_client = false;
+				if (cr->prev_client != LFS_NO_CLIENT_LE)
+					return false;
+			}
+		}
+	}
+
+	return true;
+}
+
+/*
+ * remove_client
+ *
+ * remove a client record from a client record list an restart area
+ */
+static inline void remove_client(struct CLIENT_REC *ca,
+				 const struct CLIENT_REC *cr, __le16 *head)
+{
+	if (cr->prev_client == LFS_NO_CLIENT_LE)
+		*head = cr->next_client;
+	else
+		ca[le16_to_cpu(cr->prev_client)].next_client = cr->next_client;
+
+	if (cr->next_client != LFS_NO_CLIENT_LE)
+		ca[le16_to_cpu(cr->next_client)].prev_client = cr->prev_client;
+}
+
+/*
+ * add_client
+ *
+ * add a client record to the start of a list
+ */
+static inline void add_client(struct CLIENT_REC *ca, u16 index, __le16 *head)
+{
+	struct CLIENT_REC *cr = ca + index;
+
+	cr->prev_client = LFS_NO_CLIENT_LE;
+	cr->next_client = *head;
+
+	if (*head != LFS_NO_CLIENT_LE)
+		ca[le16_to_cpu(*head)].prev_client = cpu_to_le16(index);
+
+	*head = cpu_to_le16(index);
+}
+
+/*
+ * enum_rstbl
+ *
+ */
+static inline void *enum_rstbl(struct RESTART_TABLE *t, void *c)
+{
+	__le32 *e;
+	u32 bprt;
+	u16 rsize = t ? le16_to_cpu(t->size) : 0;
+
+	if (!c) {
+		if (!t || !t->total)
+			return NULL;
+		e = Add2Ptr(t, sizeof(struct RESTART_TABLE));
+	} else {
+		e = Add2Ptr(c, rsize);
+	}
+
+	/* Loop until we hit the first one allocated, or the end of the list */
+	for (bprt = bytes_per_rt(t); PtrOffset(t, e) < bprt;
+	     e = Add2Ptr(e, rsize)) {
+		if (*e == RESTART_ENTRY_ALLOCATED_LE)
+			return e;
+	}
+	return NULL;
+}
+
+/*
+ * find_dp
+ *
+ * searches for a 'vcn' in Dirty Page Table,
+ */
+static inline struct DIR_PAGE_ENTRY *find_dp(struct RESTART_TABLE *dptbl,
+					     u32 target_attr, u64 vcn)
+{
+	__le32 ta = cpu_to_le32(target_attr);
+	struct DIR_PAGE_ENTRY *dp = NULL;
+
+	while ((dp = enum_rstbl(dptbl, dp))) {
+		u64 dp_vcn = le64_to_cpu(dp->vcn);
+
+		if (dp->target_attr == ta && vcn >= dp_vcn &&
+		    vcn < dp_vcn + le32_to_cpu(dp->lcns_follow)) {
+			return dp;
+		}
+	}
+	return NULL;
+}
+
+static inline u32 norm_file_page(u32 page_size, u32 *l_size, bool use_default)
+{
+	if (use_default)
+		page_size = DefaultLogPageSize;
+
+	/* Round the file size down to a system page boundary */
+	*l_size &= ~(page_size - 1);
+
+	/* File should contain at least 2 restart pages and MinLogRecordPages pages */
+	if (*l_size < (MinLogRecordPages + 2) * page_size)
+		return 0;
+
+	return page_size;
+}
+
+static bool check_log_rec(const struct LOG_REC_HDR *lr, u32 bytes, u32 tr,
+			  u32 bytes_per_attr_entry)
+{
+	u16 t16;
+
+	if (bytes < sizeof(struct LOG_REC_HDR))
+		return false;
+	if (!tr)
+		return false;
+
+	if ((tr - sizeof(struct RESTART_TABLE)) %
+	    sizeof(struct TRANSACTION_ENTRY))
+		return false;
+
+	if (le16_to_cpu(lr->redo_off) & 7)
+		return false;
+
+	if (le16_to_cpu(lr->undo_off) & 7)
+		return false;
+
+	if (lr->target_attr)
+		goto check_lcns;
+
+	if (is_target_required(le16_to_cpu(lr->redo_op)))
+		return false;
+
+	if (is_target_required(le16_to_cpu(lr->undo_op)))
+		return false;
+
+check_lcns:
+	if (!lr->lcns_follow)
+		goto check_length;
+
+	t16 = le16_to_cpu(lr->target_attr);
+	if ((t16 - sizeof(struct RESTART_TABLE)) % bytes_per_attr_entry)
+		return false;
+
+check_length:
+	if (bytes < lrh_length(lr))
+		return false;
+
+	return true;
+}
+
+static bool check_rstbl(const struct RESTART_TABLE *rt, size_t bytes)
+{
+	u32 ts;
+	u32 i, off;
+	u16 rsize = le16_to_cpu(rt->size);
+	u16 ne = le16_to_cpu(rt->used);
+	u32 ff = le32_to_cpu(rt->first_free);
+	u32 lf = le32_to_cpu(rt->last_free);
+
+	ts = rsize * ne + sizeof(struct RESTART_TABLE);
+
+	if (!rsize || rsize > bytes ||
+	    rsize + sizeof(struct RESTART_TABLE) > bytes || bytes < ts ||
+	    le16_to_cpu(rt->total) > ne || ff > ts || lf > ts ||
+	    (ff && ff < sizeof(struct RESTART_TABLE)) ||
+	    (lf && lf < sizeof(struct RESTART_TABLE))) {
+		return false;
+	}
+
+	/* Verify each entry is either allocated or points
+	 * to a valid offset the table
+	 */
+	for (i = 0; i < ne; i++) {
+		off = le32_to_cpu(*(__le32 *)Add2Ptr(
+			rt, i * rsize + sizeof(struct RESTART_TABLE)));
+
+		if (off != RESTART_ENTRY_ALLOCATED && off &&
+		    (off < sizeof(struct RESTART_TABLE) ||
+		     ((off - sizeof(struct RESTART_TABLE)) % rsize))) {
+			return false;
+		}
+	}
+
+	/* Walk through the list headed by the first entry to make
+	 * sure none of the entries are currently being used
+	 */
+	for (off = ff; off;) {
+		if (off == RESTART_ENTRY_ALLOCATED)
+			return false;
+
+		off = le32_to_cpu(*(__le32 *)Add2Ptr(rt, off));
+	}
+
+	return true;
+}
+
+/*
+ * free_rsttbl_idx
+ *
+ * frees a previously allocated index a Restart Table.
+ */
+static inline void free_rsttbl_idx(struct RESTART_TABLE *rt, u32 off)
+{
+	__le32 *e;
+	u32 lf = le32_to_cpu(rt->last_free);
+	__le32 off_le = cpu_to_le32(off);
+
+	e = Add2Ptr(rt, off);
+
+	if (off < le32_to_cpu(rt->free_goal)) {
+		*e = rt->first_free;
+		rt->first_free = off_le;
+		if (!lf)
+			rt->last_free = off_le;
+	} else {
+		if (lf)
+			*(__le32 *)Add2Ptr(rt, lf) = off_le;
+		else
+			rt->first_free = off_le;
+
+		rt->last_free = off_le;
+		*e = 0;
+	}
+
+	le16_sub_cpu(&rt->total, 1);
+}
+
+static inline struct RESTART_TABLE *init_rsttbl(u16 esize, u16 used)
+{
+	__le32 *e, *last_free;
+	u32 off;
+	u32 bytes = esize * used + sizeof(struct RESTART_TABLE);
+	u32 lf = sizeof(struct RESTART_TABLE) + (used - 1) * esize;
+	struct RESTART_TABLE *t = ntfs_zalloc(bytes);
+
+	t->size = cpu_to_le16(esize);
+	t->used = cpu_to_le16(used);
+	t->free_goal = cpu_to_le32(~0u);
+	t->first_free = cpu_to_le32(sizeof(struct RESTART_TABLE));
+	t->last_free = cpu_to_le32(lf);
+
+	e = (__le32 *)(t + 1);
+	last_free = Add2Ptr(t, lf);
+
+	for (off = sizeof(struct RESTART_TABLE) + esize; e < last_free;
+	     e = Add2Ptr(e, esize), off += esize) {
+		*e = cpu_to_le32(off);
+	}
+	return t;
+}
+
+static inline struct RESTART_TABLE *extend_rsttbl(struct RESTART_TABLE *tbl,
+						  u32 add, u32 free_goal)
+{
+	u16 esize = le16_to_cpu(tbl->size);
+	__le32 osize = cpu_to_le32(bytes_per_rt(tbl));
+	u32 used = le16_to_cpu(tbl->used);
+	struct RESTART_TABLE *rt = init_rsttbl(esize, used + add);
+
+	memcpy(rt + 1, tbl + 1, esize * used);
+
+	rt->free_goal = free_goal == ~0u ?
+				cpu_to_le32(~0u) :
+				cpu_to_le32(sizeof(struct RESTART_TABLE) +
+					    free_goal * esize);
+
+	if (tbl->first_free) {
+		rt->first_free = tbl->first_free;
+		*(__le32 *)Add2Ptr(rt, le32_to_cpu(tbl->last_free)) = osize;
+	} else {
+		rt->first_free = osize;
+	}
+
+	rt->total = tbl->total;
+
+	ntfs_free(tbl);
+	return rt;
+}
+
+/*
+ * alloc_rsttbl_idx
+ *
+ * allocates an index from within a previously initialized Restart Table
+ */
+static inline void *alloc_rsttbl_idx(struct RESTART_TABLE **tbl)
+{
+	u32 off;
+	__le32 *e;
+	struct RESTART_TABLE *t = *tbl;
+
+	if (!t->first_free)
+		*tbl = t = extend_rsttbl(t, 16, ~0u);
+
+	off = le32_to_cpu(t->first_free);
+
+	/* Dequeue this entry and zero it. */
+	e = Add2Ptr(t, off);
+
+	t->first_free = *e;
+
+	memset(e, 0, le16_to_cpu(t->size));
+
+	*e = RESTART_ENTRY_ALLOCATED_LE;
+
+	/* If list is going empty, then we fix the last_free as well. */
+	if (!t->first_free)
+		t->last_free = 0;
+
+	le16_add_cpu(&t->total, 1);
+
+	return Add2Ptr(t, off);
+}
+
+/*
+ * alloc_rsttbl_from_idx
+ *
+ * allocates a specific index from within a previously initialized Restart Table
+ */
+static inline void *alloc_rsttbl_from_idx(struct RESTART_TABLE **tbl, u32 vbo)
+{
+	u32 off;
+	__le32 *e;
+	struct RESTART_TABLE *rt = *tbl;
+	u32 bytes = bytes_per_rt(rt);
+	u16 esize = le16_to_cpu(rt->size);
+
+	/* If the entry is not the table, we will have to extend the table */
+	if (vbo >= bytes) {
+		/*
+		 * extend the size by computing the number of entries between
+		 * the existing size and the desired index and adding
+		 * 1 to that
+		 */
+		u32 bytes2idx = vbo - bytes;
+
+		/* There should always be an integral number of entries being added */
+		/* Now extend the table */
+		*tbl = rt = extend_rsttbl(rt, bytes2idx / esize + 1, bytes);
+		if (!rt)
+			return NULL;
+	}
+
+	/* see if the entry is already allocated, and just return if it is. */
+	e = Add2Ptr(rt, vbo);
+
+	if (*e == RESTART_ENTRY_ALLOCATED_LE)
+		return e;
+
+	/*
+	 * Walk through the table, looking for the entry we're
+	 * interested and the previous entry
+	 */
+	off = le32_to_cpu(rt->first_free);
+	e = Add2Ptr(rt, off);
+
+	if (off == vbo) {
+		/* this is a match */
+		rt->first_free = *e;
+		goto skip_looking;
+	}
+
+	/*
+	 * need to walk through the list looking for the predecessor of our entry
+	 */
+	for (;;) {
+		/* Remember the entry just found */
+		u32 last_off = off;
+		__le32 *last_e = e;
+
+		/* should never run of entries. */
+
+		/* Lookup up the next entry the list */
+		off = le32_to_cpu(*last_e);
+		e = Add2Ptr(rt, off);
+
+		/* If this is our match we are done */
+		if (off == vbo) {
+			*last_e = *e;
+
+			/* If this was the last entry, we update that the table as well */
+			if (le32_to_cpu(rt->last_free) == off)
+				rt->last_free = cpu_to_le32(last_off);
+			break;
+		}
+	}
+
+skip_looking:
+	/* If the list is now empty, we fix the last_free as well */
+	if (!rt->first_free)
+		rt->last_free = 0;
+
+	/* Zero this entry */
+	memset(e, 0, esize);
+	*e = RESTART_ENTRY_ALLOCATED_LE;
+
+	le16_add_cpu(&rt->total, 1);
+
+	return e;
+}
+
+#define RESTART_SINGLE_PAGE_IO cpu_to_le16(0x0001)
+
+#define NTFSLOG_WRAPPED 0x00000001
+#define NTFSLOG_MULTIPLE_PAGE_IO 0x00000002
+#define NTFSLOG_NO_LAST_LSN 0x00000004
+#define NTFSLOG_REUSE_TAIL 0x00000010
+#define NTFSLOG_NO_OLDEST_LSN 0x00000020
+
+/*
+ * Helper struct to work with NTFS $LogFile
+ */
+struct ntfs_log {
+	struct ntfs_inode *ni;
+
+	u32 l_size;
+	u32 sys_page_size;
+	u32 sys_page_mask;
+	u32 page_size;
+	u32 page_mask; // page_size - 1
+	u8 page_bits;
+	struct RECORD_PAGE_HDR *one_page_buf;
+
+	struct RESTART_TABLE *open_attr_tbl;
+	u32 transaction_id;
+	u32 clst_per_page;
+
+	u32 first_page;
+	u32 next_page;
+	u32 ra_off;
+	u32 data_off;
+	u32 restart_size;
+	u32 data_size;
+	u16 record_header_len;
+	u64 seq_num;
+	u32 seq_num_bits;
+	u32 file_data_bits;
+	u32 seq_num_mask; /* (1 << file_data_bits) - 1 */
+
+	struct RESTART_AREA *ra; /* in-memory image of the next restart area */
+	u32 ra_size; /* the usable size of the restart area */
+
+	/*
+	 * If true, then the in-memory restart area is to be written
+	 * to the first position on the disk
+	 */
+	bool init_ra;
+	bool set_dirty; /* true if we need to set dirty flag */
+
+	u64 oldest_lsn;
+
+	u32 oldest_lsn_off;
+	u64 last_lsn;
+
+	u32 total_avail;
+	u32 total_avail_pages;
+	u32 total_undo_commit;
+	u32 max_current_avail;
+	u32 current_avail;
+	u32 reserved;
+
+	short major_ver;
+	short minor_ver;
+
+	u32 l_flags; /* See NTFSLOG_XXX */
+	u32 current_openlog_count; /* On-disk value for open_log_count */
+
+	struct CLIENT_ID client_id;
+	u32 client_undo_commit;
+};
+
+static inline u32 lsn_to_vbo(struct ntfs_log *log, const u64 lsn)
+{
+	u32 vbo = (lsn << log->seq_num_bits) >> (log->seq_num_bits - 3);
+
+	return vbo;
+}
+
+/* compute the offset in the log file of the next log page */
+static inline u32 next_page_off(struct ntfs_log *log, u32 off)
+{
+	off = (off & ~log->sys_page_mask) + log->page_size;
+	return off >= log->l_size ? log->first_page : off;
+}
+
+static inline u32 lsn_to_page_off(struct ntfs_log *log, u64 lsn)
+{
+	return (((u32)lsn) << 3) & log->page_mask;
+}
+
+static inline u64 vbo_to_lsn(struct ntfs_log *log, u32 off, u64 Seq)
+{
+	return (off >> 3) + (Seq << log->file_data_bits);
+}
+
+static inline bool is_lsn_in_file(struct ntfs_log *log, u64 lsn)
+{
+	return lsn >= log->oldest_lsn &&
+	       lsn <= le64_to_cpu(log->ra->current_lsn);
+}
+
+static inline u32 hdr_file_off(struct ntfs_log *log,
+			       struct RECORD_PAGE_HDR *hdr)
+{
+	if (log->major_ver < 2)
+		return le64_to_cpu(hdr->rhdr.lsn);
+
+	return le32_to_cpu(hdr->file_off);
+}
+
+static inline u64 base_lsn(struct ntfs_log *log,
+			   const struct RECORD_PAGE_HDR *hdr, u64 lsn)
+{
+	u64 h_lsn = le64_to_cpu(hdr->rhdr.lsn);
+	u64 ret = (((h_lsn >> log->file_data_bits) +
+		    (lsn < (lsn_to_vbo(log, h_lsn) & ~log->page_mask) ? 1 : 0))
+		   << log->file_data_bits) +
+		  ((((is_log_record_end(hdr) &&
+		      h_lsn <= le64_to_cpu(hdr->record_hdr.last_end_lsn)) ?
+			     le16_to_cpu(hdr->record_hdr.next_record_off) :
+			     log->page_size) +
+		    lsn) >>
+		   3);
+
+	return ret;
+}
+
+static inline bool verify_client_lsn(struct ntfs_log *log,
+				     const struct CLIENT_REC *client, u64 lsn)
+{
+	return lsn >= le64_to_cpu(client->oldest_lsn) &&
+	       lsn <= le64_to_cpu(log->ra->current_lsn) && lsn;
+}
+
+struct restart_info {
+	u64 last_lsn;
+	struct RESTART_HDR *r_page;
+	u32 vbo;
+	bool chkdsk_was_run;
+	bool valid_page;
+	bool initialized;
+	bool restart;
+};
+
+static int read_log_page(struct ntfs_log *log, u32 vbo,
+			 struct RECORD_PAGE_HDR **buffer, bool *usa_error)
+{
+	int err = 0;
+	u32 page_idx = vbo >> log->page_bits;
+	u32 page_off = vbo & log->page_mask;
+	u32 bytes = log->page_size - page_off;
+	void *to_free = NULL;
+	u32 page_vbo = page_idx << log->page_bits;
+	struct RECORD_PAGE_HDR *page_buf;
+	struct ntfs_inode *ni = log->ni;
+	bool bBAAD;
+
+	if (vbo >= log->l_size)
+		return -EINVAL;
+
+	if (!*buffer) {
+		to_free = ntfs_malloc(bytes);
+		if (!to_free)
+			return -ENOMEM;
+		*buffer = to_free;
+	}
+
+	page_buf = page_off ? log->one_page_buf : *buffer;
+
+	err = ntfs_read_run_nb(ni->mi.sbi, &ni->file.run, page_vbo, page_buf,
+			       log->page_size, NULL);
+	if (err)
+		goto out;
+
+	if (page_buf->rhdr.sign != NTFS_FFFF_SIGNATURE)
+		ntfs_fix_post_read(&page_buf->rhdr, PAGE_SIZE, false);
+
+	if (page_buf != *buffer)
+		memcpy(*buffer, Add2Ptr(page_buf, page_off), bytes);
+
+	bBAAD = page_buf->rhdr.sign == NTFS_BAAD_SIGNATURE;
+
+	if (usa_error)
+		*usa_error = bBAAD;
+	/* Check that the update sequence array for this page is valid */
+	/* If we don't allow errors, raise an error status */
+	else if (bBAAD)
+		err = -EINVAL;
+
+out:
+	if (err && to_free) {
+		ntfs_free(to_free);
+		*buffer = NULL;
+	}
+
+	return err;
+}
+
+/*
+ * log_read_rst
+ *
+ * it walks through 512 blocks of the file looking for a valid restart page header
+ * It will stop the first time we find a valid page header
+ */
+static int log_read_rst(struct ntfs_log *log, u32 l_size, bool first,
+			struct restart_info *info)
+{
+	u32 skip, vbo;
+	struct RESTART_HDR *r_page = ntfs_malloc(DefaultLogPageSize);
+
+	if (!r_page)
+		return -ENOMEM;
+
+	memset(info, 0, sizeof(struct restart_info));
+
+	/* Determine which restart area we are looking for */
+	if (first) {
+		vbo = 0;
+		skip = 512;
+	} else {
+		vbo = 512;
+		skip = 0;
+	}
+
+	/* loop continuously until we succeed */
+	for (; vbo < l_size; vbo = 2 * vbo + skip, skip = 0) {
+		bool usa_error;
+		u32 sys_page_size;
+		bool brst, bchk;
+		struct RESTART_AREA *ra;
+
+		/* Read a page header at the current offset */
+		if (read_log_page(log, vbo, (struct RECORD_PAGE_HDR **)&r_page,
+				  &usa_error)) {
+			/* ignore any errors */
+			continue;
+		}
+
+		/* exit if the signature is a log record page */
+		if (r_page->rhdr.sign == NTFS_RCRD_SIGNATURE) {
+			info->initialized = true;
+			break;
+		}
+
+		brst = r_page->rhdr.sign == NTFS_RSTR_SIGNATURE;
+		bchk = r_page->rhdr.sign == NTFS_CHKD_SIGNATURE;
+
+		if (!bchk && !brst) {
+			if (r_page->rhdr.sign != NTFS_FFFF_SIGNATURE) {
+				/*
+				 * Remember if the signature does not
+				 * indicate uninitialized file
+				 */
+				info->initialized = true;
+			}
+			continue;
+		}
+
+		ra = NULL;
+		info->valid_page = false;
+		info->initialized = true;
+		info->vbo = vbo;
+
+		/* Let's check the restart area if this is a valid page */
+		if (!is_rst_page_hdr_valid(vbo, r_page))
+			goto check_result;
+		ra = Add2Ptr(r_page, le16_to_cpu(r_page->ra_off));
+
+		if (!is_rst_area_valid(r_page))
+			goto check_result;
+
+		/*
+		 * We have a valid restart page header and restart area.
+		 * If chkdsk was run or we have no clients then we have
+		 * no more checking to do
+		 */
+		if (bchk || ra->client_idx[1] == LFS_NO_CLIENT_LE) {
+			info->valid_page = true;
+			goto check_result;
+		}
+
+		/* Read the entire restart area */
+		sys_page_size = le32_to_cpu(r_page->sys_page_size);
+		if (DefaultLogPageSize != sys_page_size) {
+			ntfs_free(r_page);
+			r_page = ntfs_zalloc(sys_page_size);
+			if (!r_page)
+				return -ENOMEM;
+
+			if (read_log_page(log, vbo,
+					  (struct RECORD_PAGE_HDR **)&r_page,
+					  &usa_error)) {
+				/* ignore any errors */
+				ntfs_free(r_page);
+				r_page = NULL;
+				continue;
+			}
+		}
+
+		if (is_client_area_valid(r_page, usa_error)) {
+			info->valid_page = true;
+			ra = Add2Ptr(r_page, le16_to_cpu(r_page->ra_off));
+		}
+
+check_result:
+		/* If chkdsk was run then update the caller's values and return */
+		if (r_page->rhdr.sign == NTFS_CHKD_SIGNATURE) {
+			info->chkdsk_was_run = true;
+			info->last_lsn = le64_to_cpu(r_page->rhdr.lsn);
+			info->restart = true;
+			info->r_page = r_page;
+			return 0;
+		}
+
+		/* If we have a valid page then copy the values we need from it */
+		if (info->valid_page) {
+			info->last_lsn = le64_to_cpu(ra->current_lsn);
+			info->restart = true;
+			info->r_page = r_page;
+			return 0;
+		}
+	}
+
+	ntfs_free(r_page);
+
+	return 0;
+}
+
+/*
+ * log_init_pg_hdr
+ *
+ * init "log' from restart page header
+ */
+static void log_init_pg_hdr(struct ntfs_log *log, u32 sys_page_size,
+			    u32 page_size, u16 major_ver, u16 minor_ver)
+{
+	log->sys_page_size = sys_page_size;
+	log->sys_page_mask = sys_page_size - 1;
+	log->page_size = page_size;
+	log->page_mask = page_size - 1;
+	log->page_bits = blksize_bits(page_size);
+
+	log->clst_per_page = log->page_size >> log->ni->mi.sbi->cluster_bits;
+	if (!log->clst_per_page)
+		log->clst_per_page = 1;
+
+	log->first_page = major_ver >= 2 ?
+				  0x22 * page_size :
+				  ((sys_page_size << 1) + (page_size << 1));
+	log->major_ver = major_ver;
+	log->minor_ver = minor_ver;
+}
+
+/*
+ * log_create
+ *
+ * init "log" in cases when we don't have a restart area to use
+ */
+static void log_create(struct ntfs_log *log, u32 l_size, const u64 last_lsn,
+		       u32 open_log_count, bool wrapped, bool use_multi_page)
+{
+	log->l_size = l_size;
+	/* All file offsets must be quadword aligned */
+	log->file_data_bits = blksize_bits(l_size) - 3;
+	log->seq_num_mask = (8 << log->file_data_bits) - 1;
+	log->seq_num_bits = sizeof(u64) * 8 - log->file_data_bits;
+	log->seq_num = (last_lsn >> log->file_data_bits) + 2;
+	log->next_page = log->first_page;
+	log->oldest_lsn = log->seq_num << log->file_data_bits;
+	log->oldest_lsn_off = 0;
+	log->last_lsn = log->oldest_lsn;
+
+	log->l_flags |= NTFSLOG_NO_LAST_LSN | NTFSLOG_NO_OLDEST_LSN;
+
+	/* Set the correct flags for the I/O and indicate if we have wrapped */
+	if (wrapped)
+		log->l_flags |= NTFSLOG_WRAPPED;
+
+	if (use_multi_page)
+		log->l_flags |= NTFSLOG_MULTIPLE_PAGE_IO;
+
+	/* Compute the log page values */
+	log->data_off = QuadAlign(
+		offsetof(struct RECORD_PAGE_HDR, fixups) +
+		sizeof(short) * ((log->page_size >> SECTOR_SHIFT) + 1));
+	log->data_size = log->page_size - log->data_off;
+	log->record_header_len = sizeof(struct LFS_RECORD_HDR);
+
+	/* Remember the different page sizes for reservation */
+	log->reserved = log->data_size - log->record_header_len;
+
+	/* Compute the restart page values. */
+	log->ra_off = QuadAlign(
+		offsetof(struct RESTART_HDR, fixups) +
+		sizeof(short) * ((log->sys_page_size >> SECTOR_SHIFT) + 1));
+	log->restart_size = log->sys_page_size - log->ra_off;
+	log->ra_size = offsetof(struct RESTART_AREA, clients) +
+		       sizeof(struct CLIENT_REC);
+	log->current_openlog_count = open_log_count;
+
+	/*
+	 * The total available log file space is the number of
+	 * log file pages times the space available on each page
+	 */
+	log->total_avail_pages = log->l_size - log->first_page;
+	log->total_avail = log->total_avail_pages >> log->page_bits;
+
+	/*
+	 * We assume that we can't use the end of the page less than
+	 * the file record size
+	 * Then we won't need to reserve more than the caller asks for
+	 */
+	log->max_current_avail = log->total_avail * log->reserved;
+	log->total_avail = log->total_avail * log->data_size;
+	log->current_avail = log->max_current_avail;
+}
+
+/*
+ * log_create_ra
+ *
+ * This routine is called to fill a restart area from the values stored in 'log'
+ */
+static struct RESTART_AREA *log_create_ra(struct ntfs_log *log)
+{
+	struct CLIENT_REC *cr;
+	struct RESTART_AREA *ra = ntfs_zalloc(log->restart_size);
+
+	if (!ra)
+		return NULL;
+
+	ra->current_lsn = cpu_to_le64(log->last_lsn);
+	ra->log_clients = cpu_to_le16(1);
+	ra->client_idx[1] = LFS_NO_CLIENT_LE;
+	if (log->l_flags & NTFSLOG_MULTIPLE_PAGE_IO)
+		ra->flags = RESTART_SINGLE_PAGE_IO;
+	ra->seq_num_bits = cpu_to_le32(log->seq_num_bits);
+	ra->ra_len = cpu_to_le16(log->ra_size);
+	ra->client_off = cpu_to_le16(offsetof(struct RESTART_AREA, clients));
+	ra->l_size = cpu_to_le64(log->l_size);
+	ra->rec_hdr_len = cpu_to_le16(log->record_header_len);
+	ra->data_off = cpu_to_le16(log->data_off);
+	ra->open_log_count = cpu_to_le32(log->current_openlog_count + 1);
+
+	cr = ra->clients;
+
+	cr->prev_client = LFS_NO_CLIENT_LE;
+	cr->next_client = LFS_NO_CLIENT_LE;
+
+	return ra;
+}
+
+static u32 final_log_off(struct ntfs_log *log, u64 lsn, u32 data_len)
+{
+	u32 base_vbo = lsn << 3;
+	u32 final_log_off = (base_vbo & log->seq_num_mask) & ~log->page_mask;
+	u32 page_off = base_vbo & log->page_mask;
+	u32 tail = log->page_size - page_off;
+
+	page_off -= 1;
+
+	/* Add the length of the header */
+	data_len += log->record_header_len;
+
+	/*
+	 * If this lsn is contained this log page we are done
+	 * Otherwise we need to walk through several log pages
+	 */
+	if (data_len > tail) {
+		data_len -= tail;
+		tail = log->data_size;
+		page_off = log->data_off - 1;
+
+		for (;;) {
+			final_log_off = next_page_off(log, final_log_off);
+
+			/* We are done if the remaining bytes fit on this page */
+			if (data_len <= tail)
+				break;
+			data_len -= tail;
+		}
+	}
+
+	/*
+	 * We add the remaining bytes to our starting position on this page
+	 * and then add that value to the file offset of this log page
+	 */
+	return final_log_off + data_len + page_off;
+}
+
+static int next_log_lsn(struct ntfs_log *log, const struct LFS_RECORD_HDR *rh,
+			u64 *lsn)
+{
+	int err;
+	u64 this_lsn = le64_to_cpu(rh->this_lsn);
+	u32 vbo = lsn_to_vbo(log, this_lsn);
+	u32 end =
+		final_log_off(log, this_lsn, le32_to_cpu(rh->client_data_len));
+	u32 hdr_off = end & ~log->sys_page_mask;
+	u64 seq = this_lsn >> log->file_data_bits;
+	struct RECORD_PAGE_HDR *page = NULL;
+
+	/* Remember if we wrapped */
+	if (end <= vbo)
+		seq += 1;
+
+	/* log page header for this page */
+	err = read_log_page(log, hdr_off, &page, NULL);
+	if (err)
+		return err;
+
+	/*
+	 * If the lsn we were given was not the last lsn on this page,
+	 * then the starting offset for the next lsn is on a quad word
+	 * boundary following the last file offset for the current lsn
+	 * Otherwise the file offset is the start of the data on the next page
+	 */
+	if (this_lsn == le64_to_cpu(page->rhdr.lsn)) {
+		/* If we wrapped, we need to increment the sequence number */
+		hdr_off = next_page_off(log, hdr_off);
+		if (hdr_off == log->first_page)
+			seq += 1;
+
+		vbo = hdr_off + log->data_off;
+	} else {
+		vbo = QuadAlign(end);
+	}
+
+	/* Compute the lsn based on the file offset and the sequence count */
+	*lsn = vbo_to_lsn(log, vbo, seq);
+
+	/*
+	 * If this lsn is within the legal range for the file, we return true
+	 * Otherwise false indicates that there are no more lsn's
+	 */
+	if (!is_lsn_in_file(log, *lsn))
+		*lsn = 0;
+
+	ntfs_free(page);
+
+	return 0;
+}
+
+/*
+ * current_log_avail
+ *
+ * calculate the number of bytes available for log records
+ */
+static u32 current_log_avail(struct ntfs_log *log)
+{
+	u32 oldest_off, next_free_off, free_bytes;
+
+	if (log->l_flags & NTFSLOG_NO_LAST_LSN) {
+		/* The entire file is available */
+		return log->max_current_avail;
+	}
+
+	/*
+	 * If there is a last lsn the restart area then we know that we will
+	 * have to compute the free range
+	 * If there is no oldest lsn then start at the first page of the file
+	 */
+	oldest_off = (log->l_flags & NTFSLOG_NO_OLDEST_LSN) ?
+			     log->first_page :
+			     (log->oldest_lsn_off & ~log->sys_page_mask);
+
+	/*
+	 * We will use the next log page offset to compute the next free page\
+	 * If we are going to reuse this page go to the next page
+	 * If we are at the first page then use the end of the file
+	 */
+	next_free_off = (log->l_flags & NTFSLOG_REUSE_TAIL) ?
+				log->next_page + log->page_size :
+				log->next_page == log->first_page ?
+				log->l_size :
+				log->next_page;
+
+	/* If the two offsets are the same then there is no available space */
+	if (oldest_off == next_free_off)
+		return 0;
+	/*
+	 * If the free offset follows the oldest offset then subtract
+	 * this range from the total available pages
+	 */
+	free_bytes =
+		oldest_off < next_free_off ?
+			log->total_avail_pages - (next_free_off - oldest_off) :
+			oldest_off - next_free_off;
+
+	free_bytes >>= log->page_bits;
+	return free_bytes * log->reserved;
+}
+
+static bool check_subseq_log_page(struct ntfs_log *log,
+				  const struct RECORD_PAGE_HDR *rp, u32 vbo,
+				  u64 seq)
+{
+	u64 lsn_seq;
+	const struct NTFS_RECORD_HEADER *rhdr = &rp->rhdr;
+	u64 lsn = le64_to_cpu(rhdr->lsn);
+
+	if (rhdr->sign == NTFS_FFFF_SIGNATURE || !rhdr->sign)
+		return false;
+
+	/*
+	 * If the last lsn on the page occurs was written after the page
+	 * that caused the original error then we have a fatal error
+	 */
+	lsn_seq = lsn >> log->file_data_bits;
+
+	/*
+	 * If the sequence number for the lsn the page is equal or greater
+	 * than lsn we expect, then this is a subsequent write
+	 */
+	return lsn_seq >= seq ||
+	       (lsn_seq == seq - 1 && log->first_page == vbo &&
+		vbo != (lsn_to_vbo(log, lsn) & ~log->page_mask));
+}
+
+/*
+ * last_log_lsn
+ *
+ * This routine walks through the log pages for a file, searching for the
+ * last log page written to the file
+ */
+static int last_log_lsn(struct ntfs_log *log)
+{
+	int err;
+	bool usa_error = false;
+	bool replace_page = false;
+	bool reuse_page = log->l_flags & NTFSLOG_REUSE_TAIL;
+	bool wrapped_file, wrapped;
+
+	u32 page_cnt = 1, page_pos = 1;
+	u32 page_off = 0, page_off1 = 0, saved_off = 0;
+	u32 final_off, second_off, final_off_prev = 0, second_off_prev = 0;
+	u32 first_file_off = 0, second_file_off = 0;
+	u32 part_io_count = 0;
+	u32 tails = 0;
+	u32 this_off, curpage_off, nextpage_off, remain_pages;
+
+	u64 expected_seq, seq_base = 0, lsn_base = 0;
+	u64 best_lsn, best_lsn1, best_lsn2;
+	u64 lsn_cur, lsn1, lsn2;
+	u64 last_ok_lsn = reuse_page ? log->last_lsn : 0;
+
+	u16 cur_pos, best_page_pos;
+
+	struct RECORD_PAGE_HDR *page = NULL;
+	struct RECORD_PAGE_HDR *tst_page = NULL;
+	struct RECORD_PAGE_HDR *first_tail = NULL;
+	struct RECORD_PAGE_HDR *second_tail = NULL;
+	struct RECORD_PAGE_HDR *tail_page = NULL;
+	struct RECORD_PAGE_HDR *second_tail_prev = NULL,
+			       *first_tail_prev = NULL;
+	struct RECORD_PAGE_HDR *page_bufs = NULL;
+	struct RECORD_PAGE_HDR *best_page;
+
+	if (log->major_ver >= 2) {
+		final_off = 0x02 * log->page_size;
+		second_off = 0x12 * log->page_size;
+
+		// 0x10 == 0x12 - 0x2
+		page_bufs = ntfs_malloc(log->page_size * 0x10);
+		if (!page_bufs)
+			return -ENOMEM;
+	} else {
+		second_off = log->first_page - log->page_size;
+		final_off = second_off - log->page_size;
+	}
+
+next_tail:
+	/* Read second tail page (at pos 3/0x12000) */
+	if (read_log_page(log, second_off, &second_tail, &usa_error) ||
+	    usa_error || second_tail->rhdr.sign != NTFS_RCRD_SIGNATURE) {
+		ntfs_free(second_tail);
+		second_tail = NULL;
+		second_file_off = 0;
+		lsn2 = 0;
+	} else {
+		second_file_off = hdr_file_off(log, second_tail);
+		lsn2 = le64_to_cpu(second_tail->record_hdr.last_end_lsn);
+	}
+
+	/* Read first tail page (at pos 2/0x2000 ) */
+	if (read_log_page(log, final_off, &first_tail, &usa_error) ||
+	    usa_error || first_tail->rhdr.sign != NTFS_RCRD_SIGNATURE) {
+		ntfs_free(first_tail);
+		first_tail = NULL;
+		first_file_off = 0;
+		lsn1 = 0;
+	} else {
+		first_file_off = hdr_file_off(log, first_tail);
+		lsn1 = le64_to_cpu(first_tail->record_hdr.last_end_lsn);
+	}
+
+	if (log->major_ver < 2) {
+		int best_page;
+
+		first_tail_prev = first_tail;
+		final_off_prev = first_file_off;
+		second_tail_prev = second_tail;
+		second_off_prev = second_file_off;
+		tails = 1;
+
+		if (!first_tail && !second_tail)
+			goto tail_read;
+
+		if (first_tail && second_tail)
+			best_page = lsn1 < lsn2 ? 1 : 0;
+		else if (first_tail)
+			best_page = 0;
+		else
+			best_page = 1;
+
+		page_off = best_page ? second_file_off : first_file_off;
+		seq_base = (best_page ? lsn2 : lsn1) >> log->file_data_bits;
+		goto tail_read;
+	}
+
+	best_lsn1 = first_tail ? base_lsn(log, first_tail, first_file_off) : 0;
+	best_lsn2 =
+		second_tail ? base_lsn(log, second_tail, second_file_off) : 0;
+
+	if (first_tail && second_tail) {
+		if (best_lsn1 > best_lsn2) {
+			best_lsn = best_lsn1;
+			best_page = first_tail;
+			this_off = first_file_off;
+		} else {
+			best_lsn = best_lsn2;
+			best_page = second_tail;
+			this_off = second_file_off;
+		}
+	} else if (first_tail) {
+		best_lsn = best_lsn1;
+		best_page = first_tail;
+		this_off = first_file_off;
+	} else if (second_tail) {
+		best_lsn = best_lsn2;
+		best_page = second_tail;
+		this_off = second_file_off;
+	} else {
+		goto tail_read;
+	}
+
+	best_page_pos = le16_to_cpu(best_page->page_pos);
+
+	if (!tails) {
+		if (best_page_pos == page_pos) {
+			seq_base = best_lsn >> log->file_data_bits;
+			saved_off = page_off = le32_to_cpu(best_page->file_off);
+			lsn_base = best_lsn;
+
+			memmove(page_bufs, best_page, log->page_size);
+
+			page_cnt = le16_to_cpu(best_page->page_count);
+			if (page_cnt > 1)
+				page_pos += 1;
+
+			tails = 1;
+		}
+	} else if (seq_base == (best_lsn >> log->file_data_bits) &&
+		   saved_off + log->page_size == this_off &&
+		   lsn_base < best_lsn &&
+		   (page_pos != page_cnt || best_page_pos == page_pos ||
+		    best_page_pos == 1) &&
+		   (page_pos >= page_cnt || best_page_pos == page_pos)) {
+		u16 bppc = le16_to_cpu(best_page->page_count);
+
+		saved_off += log->page_size;
+		lsn_base = best_lsn;
+
+		memmove(Add2Ptr(page_bufs, tails * log->page_size), best_page,
+			log->page_size);
+
+		tails += 1;
+
+		if (best_page_pos != bppc) {
+			page_cnt = bppc;
+			page_pos = best_page_pos;
+
+			if (page_cnt > 1)
+				page_pos += 1;
+		} else {
+			page_pos = page_cnt = 1;
+		}
+	} else {
+		ntfs_free(first_tail);
+		ntfs_free(second_tail);
+		goto tail_read;
+	}
+
+	ntfs_free(first_tail_prev);
+	first_tail_prev = first_tail;
+	final_off_prev = first_file_off;
+	first_tail = NULL;
+
+	ntfs_free(second_tail_prev);
+	second_tail_prev = second_tail;
+	second_off_prev = second_file_off;
+	second_tail = NULL;
+
+	final_off += log->page_size;
+	second_off += log->page_size;
+
+	if (tails < 0x10)
+		goto next_tail;
+tail_read:
+	first_tail = first_tail_prev;
+	final_off = final_off_prev;
+
+	second_tail = second_tail_prev;
+	second_off = second_off_prev;
+
+	page_cnt = page_pos = 1;
+
+	curpage_off = seq_base == log->seq_num ? min(log->next_page, page_off) :
+						 log->next_page;
+
+	wrapped_file =
+		curpage_off == log->first_page &&
+		!(log->l_flags & (NTFSLOG_NO_LAST_LSN | NTFSLOG_REUSE_TAIL));
+
+	expected_seq = wrapped_file ? (log->seq_num + 1) : log->seq_num;
+
+	nextpage_off = curpage_off;
+
+next_page:
+	tail_page = NULL;
+	/* Read the next log page */
+	err = read_log_page(log, curpage_off, &page, &usa_error);
+
+	/* Compute the next log page offset the file */
+	nextpage_off = next_page_off(log, curpage_off);
+	wrapped = nextpage_off == log->first_page;
+
+	if (tails > 1) {
+		struct RECORD_PAGE_HDR *cur_page =
+			Add2Ptr(page_bufs, curpage_off - page_off);
+
+		if (curpage_off == saved_off) {
+			tail_page = cur_page;
+			goto use_tail_page;
+		}
+
+		if (page_off > curpage_off || curpage_off >= saved_off)
+			goto use_tail_page;
+
+		if (page_off1)
+			goto use_cur_page;
+
+		if (!err && !usa_error &&
+		    page->rhdr.sign == NTFS_RCRD_SIGNATURE &&
+		    cur_page->rhdr.lsn == page->rhdr.lsn &&
+		    cur_page->record_hdr.next_record_off ==
+			    page->record_hdr.next_record_off &&
+		    ((page_pos == page_cnt &&
+		      le16_to_cpu(page->page_pos) == 1) ||
+		     (page_pos != page_cnt &&
+		      le16_to_cpu(page->page_pos) == page_pos + 1 &&
+		      le16_to_cpu(page->page_count) == page_cnt))) {
+			cur_page = NULL;
+			goto use_tail_page;
+		}
+
+		page_off1 = page_off;
+
+use_cur_page:
+
+		lsn_cur = le64_to_cpu(cur_page->rhdr.lsn);
+
+		if (last_ok_lsn !=
+			    le64_to_cpu(cur_page->record_hdr.last_end_lsn) &&
+		    ((lsn_cur >> log->file_data_bits) +
+		     ((curpage_off <
+		       (lsn_to_vbo(log, lsn_cur) & ~log->page_mask)) ?
+			      1 :
+			      0)) != expected_seq) {
+			goto check_tail;
+		}
+
+		if (!is_log_record_end(cur_page)) {
+			tail_page = NULL;
+			last_ok_lsn = lsn_cur;
+			goto next_page_1;
+		}
+
+		log->seq_num = expected_seq;
+		log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+		log->last_lsn = le64_to_cpu(cur_page->record_hdr.last_end_lsn);
+		log->ra->current_lsn = cur_page->record_hdr.last_end_lsn;
+
+		if (log->record_header_len <=
+		    log->page_size -
+			    le16_to_cpu(cur_page->record_hdr.next_record_off)) {
+			log->l_flags |= NTFSLOG_REUSE_TAIL;
+			log->next_page = curpage_off;
+		} else {
+			log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+			log->next_page = nextpage_off;
+		}
+
+		if (wrapped_file)
+			log->l_flags |= NTFSLOG_WRAPPED;
+
+		last_ok_lsn = le64_to_cpu(cur_page->record_hdr.last_end_lsn);
+		goto next_page_1;
+	}
+
+	/*
+	 * If we are at the expected first page of a transfer check to see
+	 * if either tail copy is at this offset
+	 * If this page is the last page of a transfer, check if we wrote
+	 * a subsequent tail copy
+	 */
+	if (page_cnt == page_pos || page_cnt == page_pos + 1) {
+		/*
+		 * Check if the offset matches either the first or second
+		 * tail copy. It is possible it will match both
+		 */
+		if (curpage_off == final_off)
+			tail_page = first_tail;
+
+		/*
+		 * If we already matched on the first page then
+		 * check the ending lsn's.
+		 */
+		if (curpage_off == second_off) {
+			if (!tail_page ||
+			    (second_tail &&
+			     le64_to_cpu(second_tail->record_hdr.last_end_lsn) >
+				     le64_to_cpu(first_tail->record_hdr
+							 .last_end_lsn))) {
+				tail_page = second_tail;
+			}
+		}
+	}
+
+use_tail_page:
+	if (tail_page) {
+		/* we have a candidate for a tail copy */
+		lsn_cur = le64_to_cpu(tail_page->record_hdr.last_end_lsn);
+
+		if (last_ok_lsn < lsn_cur) {
+			/*
+			 * If the sequence number is not expected,
+			 * then don't use the tail copy
+			 */
+			if (expected_seq != (lsn_cur >> log->file_data_bits))
+				tail_page = NULL;
+		} else if (last_ok_lsn > lsn_cur) {
+			/*
+			 * If the last lsn is greater than the one on
+			 * this page then forget this tail
+			 */
+			tail_page = NULL;
+		}
+	}
+
+	/* If we have an error on the current page, we will break of this loop */
+	if (err || usa_error)
+		goto check_tail;
+
+	/*
+	 * Done if the last lsn on this page doesn't match the previous known
+	 * last lsn or the sequence number is not expected
+	 */
+	lsn_cur = le64_to_cpu(page->rhdr.lsn);
+	if (last_ok_lsn != lsn_cur &&
+	    expected_seq != (lsn_cur >> log->file_data_bits)) {
+		goto check_tail;
+	}
+
+	/*
+	 * Check that the page position and page count values are correct
+	 * If this is the first page of a transfer the position must be 1
+	 * and the count will be unknown
+	 */
+	if (page_cnt == page_pos) {
+		if (page->page_pos != cpu_to_le16(1) &&
+		    (!reuse_page || page->page_pos != page->page_count)) {
+			/*
+			 * If the current page is the first page we are
+			 * looking at and we are reusing this page then
+			 * it can be either the first or last page of a
+			 * transfer. Otherwise it can only be the first.
+			 */
+			goto check_tail;
+		}
+	} else if (le16_to_cpu(page->page_count) != page_cnt ||
+		   le16_to_cpu(page->page_pos) != page_pos + 1) {
+		/*
+		 * The page position better be 1 more than the last page
+		 * position and the page count better match
+		 */
+		goto check_tail;
+	}
+
+	/*
+	 * We have a valid page the file and may have a valid page
+	 * the tail copy area
+	 * If the tail page was written after the page the file then
+	 * break of the loop
+	 */
+	if (tail_page &&
+	    le64_to_cpu(tail_page->record_hdr.last_end_lsn) > lsn_cur) {
+		/* Remember if we will replace the page */
+		replace_page = true;
+		goto check_tail;
+	}
+
+	tail_page = NULL;
+
+	if (is_log_record_end(page)) {
+		/*
+		 * Since we have read this page we know the sequence number
+		 * is the same as our expected value
+		 */
+		log->seq_num = expected_seq;
+		log->last_lsn = le64_to_cpu(page->record_hdr.last_end_lsn);
+		log->ra->current_lsn = page->record_hdr.last_end_lsn;
+		log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+
+		/*
+		 * If there is room on this page for another header then
+		 * remember we want to reuse the page
+		 */
+		if (log->record_header_len <=
+		    log->page_size -
+			    le16_to_cpu(page->record_hdr.next_record_off)) {
+			log->l_flags |= NTFSLOG_REUSE_TAIL;
+			log->next_page = curpage_off;
+		} else {
+			log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+			log->next_page = nextpage_off;
+		}
+
+		/* Remember if we wrapped the log file */
+		if (wrapped_file)
+			log->l_flags |= NTFSLOG_WRAPPED;
+	}
+
+	/*
+	 * Remember the last page count and position.
+	 * Also remember the last known lsn
+	 */
+	page_cnt = le16_to_cpu(page->page_count);
+	page_pos = le16_to_cpu(page->page_pos);
+	last_ok_lsn = le64_to_cpu(page->rhdr.lsn);
+
+next_page_1:
+
+	if (wrapped) {
+		expected_seq += 1;
+		wrapped_file = 1;
+	}
+
+	curpage_off = nextpage_off;
+	ntfs_free(page);
+	page = NULL;
+	reuse_page = 0;
+	goto next_page;
+
+check_tail:
+	if (tail_page) {
+		log->seq_num = expected_seq;
+		log->last_lsn = le64_to_cpu(tail_page->record_hdr.last_end_lsn);
+		log->ra->current_lsn = tail_page->record_hdr.last_end_lsn;
+		log->l_flags &= ~NTFSLOG_NO_LAST_LSN;
+
+		if (log->page_size -
+			    le16_to_cpu(
+				    tail_page->record_hdr.next_record_off) >=
+		    log->record_header_len) {
+			log->l_flags |= NTFSLOG_REUSE_TAIL;
+			log->next_page = curpage_off;
+		} else {
+			log->l_flags &= ~NTFSLOG_REUSE_TAIL;
+			log->next_page = nextpage_off;
+		}
+
+		if (wrapped)
+			log->l_flags |= NTFSLOG_WRAPPED;
+	}
+
+	/* Remember that the partial IO will start at the next page */
+	second_off = nextpage_off;
+
+	/*
+	 * If the next page is the first page of the file then update
+	 * the sequence number for log records which begon the next page
+	 */
+	if (wrapped)
+		expected_seq += 1;
+
+	/*
+	 * If we have a tail copy or are performing single page I/O we can
+	 * immediately look at the next page
+	 */
+	if (replace_page || (log->ra->flags & RESTART_SINGLE_PAGE_IO)) {
+		page_cnt = 2;
+		page_pos = 1;
+		goto check_valid;
+	}
+
+	if (page_pos != page_cnt)
+		goto check_valid;
+	/*
+	 * If the next page causes us to wrap to the beginning of the log
+	 * file then we know which page to check next.
+	 */
+	if (wrapped) {
+		page_cnt = 2;
+		page_pos = 1;
+		goto check_valid;
+	}
+
+	cur_pos = 2;
+
+next_test_page:
+	ntfs_free(tst_page);
+	tst_page = NULL;
+
+	/* Walk through the file, reading log pages */
+	err = read_log_page(log, nextpage_off, &tst_page, &usa_error);
+
+	/*
+	 * If we get a USA error then assume that we correctly found
+	 * the end of the original transfer
+	 */
+	if (usa_error)
+		goto file_is_valid;
+
+	/*
+	 * If we were able to read the page, we examine it to see if it
+	 * is the same or different Io block
+	 */
+	if (err)
+		goto next_test_page_1;
+
+	if (le16_to_cpu(tst_page->page_pos) == cur_pos &&
+	    check_subseq_log_page(log, tst_page, nextpage_off, expected_seq)) {
+		page_cnt = le16_to_cpu(tst_page->page_count) + 1;
+		page_pos = le16_to_cpu(tst_page->page_pos);
+		goto check_valid;
+	} else {
+		goto file_is_valid;
+	}
+
+next_test_page_1:
+
+	nextpage_off = next_page_off(log, curpage_off);
+	wrapped = nextpage_off == log->first_page;
+
+	if (wrapped) {
+		expected_seq += 1;
+		page_cnt = 2;
+		page_pos = 1;
+	}
+
+	cur_pos += 1;
+	part_io_count += 1;
+	if (!wrapped)
+		goto next_test_page;
+
+check_valid:
+	/* Skip over the remaining pages this transfer */
+	remain_pages = page_cnt - page_pos - 1;
+	part_io_count += remain_pages;
+
+	while (remain_pages--) {
+		nextpage_off = next_page_off(log, curpage_off);
+		wrapped = nextpage_off == log->first_page;
+
+		if (wrapped)
+			expected_seq += 1;
+	}
+
+	/* Call our routine to check this log page */
+	ntfs_free(tst_page);
+	tst_page = NULL;
+
+	err = read_log_page(log, nextpage_off, &tst_page, &usa_error);
+	if (!err && !usa_error &&
+	    check_subseq_log_page(log, tst_page, nextpage_off, expected_seq)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+file_is_valid:
+
+	/* We have a valid file */
+	if (page_off1 || tail_page) {
+		struct RECORD_PAGE_HDR *tmp_page;
+
+		if (sb_rdonly(log->ni->mi.sbi->sb)) {
+			err = -EROFS;
+			goto out;
+		}
+
+		if (page_off1) {
+			tmp_page = Add2Ptr(page_bufs, page_off1 - page_off);
+			tails -= (page_off1 - page_off) / log->page_size;
+			if (!tail_page)
+				tails -= 1;
+		} else {
+			tmp_page = tail_page;
+			tails = 1;
+		}
+
+		while (tails--) {
+			u64 off = hdr_file_off(log, tmp_page);
+
+			if (!page) {
+				page = ntfs_malloc(log->page_size);
+				if (!page)
+					return -ENOMEM;
+			}
+
+			/*
+			 * Correct page and copy the data from this page
+			 * into it and flush it to disk
+			 */
+			memcpy(page, tmp_page, log->page_size);
+
+			/* Fill last flushed lsn value flush the page */
+			if (log->major_ver < 2)
+				page->rhdr.lsn = page->record_hdr.last_end_lsn;
+			else
+				page->file_off = 0;
+
+			page->page_pos = page->page_count = cpu_to_le16(1);
+
+			ntfs_fix_pre_write(&page->rhdr, log->page_size);
+
+			err = ntfs_sb_write_run(log->ni->mi.sbi,
+						&log->ni->file.run, off, page,
+						log->page_size);
+
+			if (err)
+				goto out;
+
+			if (part_io_count && second_off == off) {
+				second_off += log->page_size;
+				part_io_count -= 1;
+			}
+
+			tmp_page = Add2Ptr(tmp_page, log->page_size);
+		}
+	}
+
+	if (part_io_count) {
+		if (sb_rdonly(log->ni->mi.sbi->sb)) {
+			err = -EROFS;
+			goto out;
+		}
+	}
+
+out:
+	ntfs_free(second_tail);
+	ntfs_free(first_tail);
+	ntfs_free(page);
+	ntfs_free(tst_page);
+	ntfs_free(page_bufs);
+
+	return err;
+}
+
+/*
+ * read_log_rec_buf
+ *
+ * copies a log record from the file to a buffer
+ * The log record may span several log pages and may even wrap the file
+ */
+static int read_log_rec_buf(struct ntfs_log *log,
+			    const struct LFS_RECORD_HDR *rh, void *buffer)
+{
+	int err;
+	struct RECORD_PAGE_HDR *ph = NULL;
+	u64 lsn = le64_to_cpu(rh->this_lsn);
+	u32 vbo = lsn_to_vbo(log, lsn) & ~log->page_mask;
+	u32 off = lsn_to_page_off(log, lsn) + log->record_header_len;
+	u32 data_len = le32_to_cpu(rh->client_data_len);
+
+	/*
+	 * While there are more bytes to transfer,
+	 * we continue to attempt to perform the read
+	 */
+	for (;;) {
+		bool usa_error;
+		u32 tail = log->page_size - off;
+
+		if (tail >= data_len)
+			tail = data_len;
+
+		data_len -= tail;
+
+		err = read_log_page(log, vbo, &ph, &usa_error);
+		if (err)
+			goto out;
+
+		/*
+		 * The last lsn on this page better be greater or equal
+		 * to the lsn we are copying
+		 */
+		if (lsn > le64_to_cpu(ph->rhdr.lsn)) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		memcpy(buffer, Add2Ptr(ph, off), tail);
+
+		/* If there are no more bytes to transfer, we exit the loop */
+		if (!data_len) {
+			if (!is_log_record_end(ph) ||
+			    lsn > le64_to_cpu(ph->record_hdr.last_end_lsn)) {
+				err = -EINVAL;
+				goto out;
+			}
+			break;
+		}
+
+		if (ph->rhdr.lsn == ph->record_hdr.last_end_lsn ||
+		    lsn > le64_to_cpu(ph->rhdr.lsn)) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		vbo = next_page_off(log, vbo);
+		off = log->data_off;
+
+		/*
+		 * adjust our pointer the user's buffer to transfer
+		 * the next block to
+		 */
+		buffer = Add2Ptr(buffer, tail);
+	}
+
+out:
+	ntfs_free(ph);
+	return err;
+}
+
+static int read_rst_area(struct ntfs_log *log, struct NTFS_RESTART **rst_,
+			 u64 *lsn)
+{
+	int err;
+	struct LFS_RECORD_HDR *rh = NULL;
+	const struct CLIENT_REC *cr =
+		Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off));
+	u64 lsnr, lsnc = le64_to_cpu(cr->restart_lsn);
+	u32 len;
+	struct NTFS_RESTART *rst;
+
+	*lsn = 0;
+	*rst_ = NULL;
+
+	/* If the client doesn't have a restart area, go ahead and exit now */
+	if (!lsnc)
+		return 0;
+
+	err = read_log_page(log, lsn_to_vbo(log, lsnc),
+			    (struct RECORD_PAGE_HDR **)&rh, NULL);
+	if (err)
+		return err;
+
+	rst = NULL;
+	lsnr = le64_to_cpu(rh->this_lsn);
+
+	if (lsnc != lsnr) {
+		/* If the lsn values don't match, then the disk is corrupt */
+		err = -EINVAL;
+		goto out;
+	}
+
+	*lsn = lsnr;
+	len = le32_to_cpu(rh->client_data_len);
+
+	if (!len) {
+		err = 0;
+		goto out;
+	}
+
+	if (len < sizeof(struct NTFS_RESTART)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	rst = ntfs_malloc(len);
+	if (!rst) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	/* Copy the data into the 'rst' buffer */
+	err = read_log_rec_buf(log, rh, rst);
+	if (err)
+		goto out;
+
+	*rst_ = rst;
+	rst = NULL;
+
+out:
+	ntfs_free(rh);
+	ntfs_free(rst);
+
+	return err;
+}
+
+static int find_log_rec(struct ntfs_log *log, u64 lsn, struct lcb *lcb)
+{
+	int err;
+	struct LFS_RECORD_HDR *rh = lcb->lrh;
+	u32 rec_len, len;
+
+	/* Read the record header for this lsn */
+	if (!rh) {
+		err = read_log_page(log, lsn_to_vbo(log, lsn),
+				    (struct RECORD_PAGE_HDR **)&rh, NULL);
+
+		lcb->lrh = rh;
+		if (err)
+			return err;
+	}
+
+	/*
+	 * If the lsn the log record doesn't match the desired
+	 * lsn then the disk is corrupt
+	 */
+	if (lsn != le64_to_cpu(rh->this_lsn))
+		return -EINVAL;
+
+	len = le32_to_cpu(rh->client_data_len);
+
+	/*
+	 * check that the length field isn't greater than the total
+	 * available space the log file
+	 */
+	rec_len = len + log->record_header_len;
+	if (rec_len >= log->total_avail)
+		return -EINVAL;
+
+	/*
+	 * If the entire log record is on this log page,
+	 * put a pointer to the log record the context block
+	 */
+	if (rh->flags & LOG_RECORD_MULTI_PAGE) {
+		void *lr = ntfs_malloc(len);
+
+		if (!lr)
+			return -ENOMEM;
+
+		lcb->log_rec = lr;
+		lcb->alloc = true;
+
+		/* Copy the data into the buffer returned */
+		err = read_log_rec_buf(log, rh, lr);
+		if (err)
+			return err;
+	} else {
+		/* If beyond the end of the current page -> an error */
+		u32 page_off = lsn_to_page_off(log, lsn);
+
+		if (page_off + len + log->record_header_len > log->page_size)
+			return -EINVAL;
+
+		lcb->log_rec = Add2Ptr(rh, sizeof(struct LFS_RECORD_HDR));
+		lcb->alloc = false;
+	}
+
+	return 0;
+}
+
+/*
+ * read_log_rec_lcb
+ *
+ * initiates the query operation.
+ */
+static int read_log_rec_lcb(struct ntfs_log *log, u64 lsn, u32 ctx_mode,
+			    struct lcb **lcb_)
+{
+	int err;
+	const struct CLIENT_REC *cr;
+	struct lcb *lcb;
+
+	switch (ctx_mode) {
+	case lcb_ctx_undo_next:
+	case lcb_ctx_prev:
+	case lcb_ctx_next:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* check that the given lsn is the legal range for this client */
+	cr = Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off));
+
+	if (!verify_client_lsn(log, cr, lsn))
+		return -EINVAL;
+
+	lcb = ntfs_zalloc(sizeof(struct lcb));
+	if (!lcb)
+		return -ENOMEM;
+	lcb->client = log->client_id;
+	lcb->ctx_mode = ctx_mode;
+
+	/* Find the log record indicated by the given lsn */
+	err = find_log_rec(log, lsn, lcb);
+	if (err)
+		goto out;
+
+	*lcb_ = lcb;
+	return 0;
+
+out:
+	lcb_put(lcb);
+	*lcb_ = NULL;
+	return err;
+}
+
+/*
+ * find_client_next_lsn
+ *
+ * attempt to find the next lsn to return to a client based on the context mode.
+ */
+static int find_client_next_lsn(struct ntfs_log *log, struct lcb *lcb, u64 *lsn)
+{
+	int err;
+	u64 next_lsn;
+	struct LFS_RECORD_HDR *hdr;
+
+	hdr = lcb->lrh;
+	*lsn = 0;
+
+	if (lcb_ctx_next != lcb->ctx_mode)
+		goto check_undo_next;
+
+	/* Loop as long as another lsn can be found */
+	for (;;) {
+		u64 current_lsn;
+
+		err = next_log_lsn(log, hdr, &current_lsn);
+		if (err)
+			goto out;
+
+		if (!current_lsn)
+			break;
+
+		if (hdr != lcb->lrh)
+			ntfs_free(hdr);
+
+		hdr = NULL;
+		err = read_log_page(log, lsn_to_vbo(log, current_lsn),
+				    (struct RECORD_PAGE_HDR **)&hdr, NULL);
+		if (err)
+			goto out;
+
+		if (memcmp(&hdr->client, &lcb->client,
+			   sizeof(struct CLIENT_ID))) {
+			/*err = -EINVAL; */
+		} else if (LfsClientRecord == hdr->record_type) {
+			ntfs_free(lcb->lrh);
+			lcb->lrh = hdr;
+			*lsn = current_lsn;
+			return 0;
+		}
+	}
+
+out:
+	if (hdr != lcb->lrh)
+		ntfs_free(hdr);
+	return err;
+
+check_undo_next:
+	if (lcb_ctx_undo_next == lcb->ctx_mode)
+		next_lsn = le64_to_cpu(hdr->client_undo_next_lsn);
+	else if (lcb_ctx_prev == lcb->ctx_mode)
+		next_lsn = le64_to_cpu(hdr->client_prev_lsn);
+	else
+		return 0;
+
+	if (!next_lsn)
+		return 0;
+
+	if (!verify_client_lsn(
+		    log, Add2Ptr(log->ra, le16_to_cpu(log->ra->client_off)),
+		    next_lsn))
+		return 0;
+
+	hdr = NULL;
+	err = read_log_page(log, lsn_to_vbo(log, next_lsn),
+			    (struct RECORD_PAGE_HDR **)&hdr, NULL);
+	if (err)
+		return err;
+	ntfs_free(lcb->lrh);
+	lcb->lrh = hdr;
+
+	*lsn = next_lsn;
+
+	return 0;
+}
+
+static int read_next_log_rec(struct ntfs_log *log, struct lcb *lcb, u64 *lsn)
+{
+	int err;
+
+	err = find_client_next_lsn(log, lcb, lsn);
+	if (err)
+		return err;
+
+	if (!*lsn)
+		return 0;
+
+	if (lcb->alloc)
+		ntfs_free(lcb->log_rec);
+
+	lcb->log_rec = NULL;
+	lcb->alloc = false;
+	ntfs_free(lcb->lrh);
+	lcb->lrh = NULL;
+
+	return find_log_rec(log, *lsn, lcb);
+}
+
+static inline bool check_index_header(const struct INDEX_HDR *hdr, size_t bytes)
+{
+	__le16 mask;
+	u32 min_de, de_off, used, total;
+	const struct NTFS_DE *e;
+
+	if (hdr_has_subnode(hdr)) {
+		min_de = sizeof(struct NTFS_DE) + sizeof(u64);
+		mask = NTFS_IE_HAS_SUBNODES;
+	} else {
+		min_de = sizeof(struct NTFS_DE);
+		mask = 0;
+	}
+
+	de_off = le32_to_cpu(hdr->de_off);
+	used = le32_to_cpu(hdr->used);
+	total = le32_to_cpu(hdr->total);
+
+	if (de_off > bytes - min_de || used > bytes || total > bytes ||
+	    de_off + min_de > used || used > total) {
+		return false;
+	}
+
+	e = Add2Ptr(hdr, de_off);
+	for (;;) {
+		u16 esize = le16_to_cpu(e->size);
+		struct NTFS_DE *next = Add2Ptr(e, esize);
+
+		if (esize < min_de || PtrOffset(hdr, next) > used ||
+		    (e->flags & NTFS_IE_HAS_SUBNODES) != mask) {
+			return false;
+		}
+
+		if (de_is_last(e))
+			break;
+
+		e = next;
+	}
+
+	return true;
+}
+
+static inline bool check_index_buffer(const struct INDEX_BUFFER *ib, u32 bytes)
+{
+	u16 fo;
+	const struct NTFS_RECORD_HEADER *r = &ib->rhdr;
+
+	if (r->sign != NTFS_INDX_SIGNATURE)
+		return false;
+
+	fo = (SECTOR_SIZE - ((bytes >> SECTOR_SHIFT) + 1) * sizeof(short));
+
+	if (le16_to_cpu(r->fix_off) > fo)
+		return false;
+
+	if ((le16_to_cpu(r->fix_num) - 1) * SECTOR_SIZE != bytes)
+		return false;
+
+	return check_index_header(&ib->ihdr,
+				  bytes - offsetof(struct INDEX_BUFFER, ihdr));
+}
+
+static inline bool check_index_root(const struct ATTRIB *attr,
+				    struct ntfs_sb_info *sbi)
+{
+	bool ret;
+	const struct INDEX_ROOT *root = resident_data(attr);
+	u8 index_bits =
+		le32_to_cpu(root->index_block_size) >= sbi->cluster_size ?
+			sbi->cluster_bits :
+			SECTOR_SHIFT;
+	u8 block_clst = root->index_block_clst;
+
+	if (le32_to_cpu(attr->res.data_size) < sizeof(struct INDEX_ROOT) ||
+	    (root->type != ATTR_NAME && root->type != ATTR_ZERO) ||
+	    (root->type == ATTR_NAME &&
+	     root->rule != NTFS_COLLATION_TYPE_FILENAME) ||
+	    (le32_to_cpu(root->index_block_size) !=
+	     (block_clst << index_bits)) ||
+	    (block_clst != 1 && block_clst != 2 && block_clst != 4 &&
+	     block_clst != 8 && block_clst != 0x10 && block_clst != 0x20 &&
+	     block_clst != 0x40 && block_clst != 0x80)) {
+		return false;
+	}
+
+	ret = check_index_header(&root->ihdr,
+				 le32_to_cpu(attr->res.data_size) -
+					 offsetof(struct INDEX_ROOT, ihdr));
+	return ret;
+}
+
+static inline bool check_attr(const struct MFT_REC *rec,
+			      const struct ATTRIB *attr,
+			      struct ntfs_sb_info *sbi)
+{
+	u32 asize = le32_to_cpu(attr->size);
+	u32 rsize = 0;
+	u64 dsize, svcn, evcn;
+	u16 run_off;
+
+	/* Check the fixed part of the attribute record header */
+	if (asize >= sbi->record_size ||
+	    asize + PtrOffset(rec, attr) >= sbi->record_size ||
+	    (attr->name_len &&
+	     le16_to_cpu(attr->name_off) + attr->name_len * sizeof(short) >
+		     asize)) {
+		return false;
+	}
+
+	/* Check the attribute fields */
+	switch (attr->non_res) {
+	case 0:
+		rsize = le32_to_cpu(attr->res.data_size);
+		if (rsize >= asize ||
+		    le16_to_cpu(attr->res.data_off) + rsize > asize) {
+			return false;
+		}
+		break;
+
+	case 1:
+		dsize = le64_to_cpu(attr->nres.data_size);
+		svcn = le64_to_cpu(attr->nres.svcn);
+		evcn = le64_to_cpu(attr->nres.evcn);
+		run_off = le16_to_cpu(attr->nres.run_off);
+
+		if (svcn > evcn + 1 || run_off >= asize ||
+		    le64_to_cpu(attr->nres.valid_size) > dsize ||
+		    dsize > le64_to_cpu(attr->nres.alloc_size)) {
+			return false;
+		}
+
+		if (run_unpack(NULL, sbi, 0, svcn, evcn, svcn,
+			       Add2Ptr(attr, run_off), asize - run_off) < 0) {
+			return false;
+		}
+
+		return true;
+
+	default:
+		return false;
+	}
+
+	switch (attr->type) {
+	case ATTR_NAME:
+		if (fname_full_size(Add2Ptr(
+			    attr, le16_to_cpu(attr->res.data_off))) > asize) {
+			return false;
+		}
+		break;
+
+	case ATTR_ROOT:
+		return check_index_root(attr, sbi);
+
+	case ATTR_STD:
+		if (rsize < sizeof(struct ATTR_STD_INFO5) &&
+		    rsize != sizeof(struct ATTR_STD_INFO)) {
+			return false;
+		}
+		break;
+
+	case ATTR_LIST:
+	case ATTR_ID:
+	case ATTR_SECURE:
+	case ATTR_LABEL:
+	case ATTR_VOL_INFO:
+	case ATTR_DATA:
+	case ATTR_ALLOC:
+	case ATTR_BITMAP:
+	case ATTR_REPARSE:
+	case ATTR_EA_INFO:
+	case ATTR_EA:
+	case ATTR_PROPERTYSET:
+	case ATTR_LOGGED_UTILITY_STREAM:
+		break;
+
+	default:
+		return false;
+	}
+
+	return true;
+}
+
+static inline bool check_file_record(const struct MFT_REC *rec,
+				     const struct MFT_REC *rec2,
+				     struct ntfs_sb_info *sbi)
+{
+	const struct ATTRIB *attr;
+	u16 fo = le16_to_cpu(rec->rhdr.fix_off);
+	u16 fn = le16_to_cpu(rec->rhdr.fix_num);
+	u16 ao = le16_to_cpu(rec->attr_off);
+	u32 rs = sbi->record_size;
+
+	/* check the file record header for consistency */
+	if (rec->rhdr.sign != NTFS_FILE_SIGNATURE ||
+	    fo > (SECTOR_SIZE - ((rs >> SECTOR_SHIFT) + 1) * sizeof(short)) ||
+	    (fn - 1) * SECTOR_SIZE != rs || ao < MFTRECORD_FIXUP_OFFSET_1 ||
+	    ao > sbi->record_size - SIZEOF_RESIDENT || !is_rec_inuse(rec) ||
+	    le32_to_cpu(rec->total) != rs) {
+		return false;
+	}
+
+	/* Loop to check all of the attributes */
+	for (attr = Add2Ptr(rec, ao); attr->type != ATTR_END;
+	     attr = Add2Ptr(attr, le32_to_cpu(attr->size))) {
+		if (check_attr(rec, attr, sbi))
+			continue;
+		return false;
+	}
+
+	return true;
+}
+
+static inline int check_lsn(const struct NTFS_RECORD_HEADER *hdr,
+			    const u64 *rlsn)
+{
+	u64 lsn;
+
+	if (!rlsn)
+		return true;
+
+	lsn = le64_to_cpu(hdr->lsn);
+
+	if (hdr->sign == NTFS_HOLE_SIGNATURE)
+		return false;
+
+	if (*rlsn > lsn)
+		return true;
+
+	return false;
+}
+
+static inline bool check_if_attr(const struct MFT_REC *rec,
+				 const struct LOG_REC_HDR *lrh)
+{
+	u16 ro = le16_to_cpu(lrh->record_off);
+	u16 o = le16_to_cpu(rec->attr_off);
+	const struct ATTRIB *attr = Add2Ptr(rec, o);
+
+	while (o < ro) {
+		u32 asize;
+
+		if (attr->type == ATTR_END)
+			break;
+
+		asize = le32_to_cpu(attr->size);
+		if (!asize)
+			break;
+
+		o += asize;
+		attr = Add2Ptr(attr, asize);
+	}
+
+	return o == ro;
+}
+
+static inline bool check_if_index_root(const struct MFT_REC *rec,
+				       const struct LOG_REC_HDR *lrh)
+{
+	u16 ro = le16_to_cpu(lrh->record_off);
+	u16 o = le16_to_cpu(rec->attr_off);
+	const struct ATTRIB *attr = Add2Ptr(rec, o);
+
+	while (o < ro) {
+		u32 asize;
+
+		if (attr->type == ATTR_END)
+			break;
+
+		asize = le32_to_cpu(attr->size);
+		if (!asize)
+			break;
+
+		o += asize;
+		attr = Add2Ptr(attr, asize);
+	}
+
+	return o == ro && attr->type == ATTR_ROOT;
+}
+
+static inline bool check_if_root_index(const struct ATTRIB *attr,
+				       const struct INDEX_HDR *hdr,
+				       const struct LOG_REC_HDR *lrh)
+{
+	u16 ao = le16_to_cpu(lrh->attr_off);
+	u32 de_off = le32_to_cpu(hdr->de_off);
+	u32 o = PtrOffset(attr, hdr) + de_off;
+	const struct NTFS_DE *e = Add2Ptr(hdr, de_off);
+	u32 asize = le32_to_cpu(attr->size);
+
+	while (o < ao) {
+		u16 esize;
+
+		if (o >= asize)
+			break;
+
+		esize = le16_to_cpu(e->size);
+		if (!esize)
+			break;
+
+		o += esize;
+		e = Add2Ptr(e, esize);
+	}
+
+	return o == ao;
+}
+
+static inline bool check_if_alloc_index(const struct INDEX_HDR *hdr,
+					u32 attr_off)
+{
+	u32 de_off = le32_to_cpu(hdr->de_off);
+	u32 o = offsetof(struct INDEX_BUFFER, ihdr) + de_off;
+	const struct NTFS_DE *e = Add2Ptr(hdr, de_off);
+	u32 used = le32_to_cpu(hdr->used);
+
+	while (o < attr_off) {
+		u16 esize;
+
+		if (de_off >= used)
+			break;
+
+		esize = le16_to_cpu(e->size);
+		if (!esize)
+			break;
+
+		o += esize;
+		de_off += esize;
+		e = Add2Ptr(e, esize);
+	}
+
+	return o == attr_off;
+}
+
+static inline void change_attr_size(struct MFT_REC *rec, struct ATTRIB *attr,
+				    u32 nsize)
+{
+	u32 asize = le32_to_cpu(attr->size);
+	int dsize = nsize - asize;
+	u8 *next = Add2Ptr(attr, asize);
+	u32 used = le32_to_cpu(rec->used);
+
+	memmove(Add2Ptr(attr, nsize), next, used - PtrOffset(rec, next));
+
+	rec->used = cpu_to_le32(used + dsize);
+	attr->size = cpu_to_le32(nsize);
+}
+
+struct OpenAttr {
+	struct ATTRIB *attr;
+	struct runs_tree *run1;
+	struct runs_tree run0;
+	struct ntfs_inode *ni;
+	// CLST rno;
+};
+
+/* Returns 0 if 'attr' has the same type and name */
+static inline int cmp_type_and_name(const struct ATTRIB *a1,
+				    const struct ATTRIB *a2)
+{
+	return a1->type != a2->type || a1->name_len != a2->name_len ||
+	       (a1->name_len && memcmp(attr_name(a1), attr_name(a2),
+				       a1->name_len * sizeof(short)));
+}
+
+static struct OpenAttr *find_loaded_attr(struct ntfs_log *log,
+					 const struct ATTRIB *attr, CLST rno)
+{
+	struct OPEN_ATTR_ENRTY *oe = NULL;
+
+	while ((oe = enum_rstbl(log->open_attr_tbl, oe))) {
+		struct OpenAttr *op_attr;
+
+		if (ino_get(&oe->ref) != rno)
+			continue;
+
+		op_attr = (struct OpenAttr *)oe->ptr;
+		if (!cmp_type_and_name(op_attr->attr, attr))
+			return op_attr;
+	}
+	return NULL;
+}
+
+static struct ATTRIB *attr_create_nonres_log(struct ntfs_sb_info *sbi,
+					     enum ATTR_TYPE type, u64 size,
+					     const u16 *name, size_t name_len,
+					     __le16 flags)
+{
+	struct ATTRIB *attr;
+	u32 name_size = QuadAlign(name_len * sizeof(short));
+	bool is_ext = flags & (ATTR_FLAG_COMPRESSED | ATTR_FLAG_SPARSED);
+	u32 asize = name_size +
+		    (is_ext ? SIZEOF_NONRESIDENT_EX : SIZEOF_NONRESIDENT);
+
+	attr = ntfs_zalloc(asize);
+	if (!attr)
+		return NULL;
+
+	attr->type = type;
+	attr->size = cpu_to_le32(asize);
+	attr->flags = flags;
+	attr->non_res = 1;
+	attr->name_len = name_len;
+
+	attr->nres.evcn = cpu_to_le64((u64)bytes_to_cluster(sbi, size) - 1);
+	attr->nres.alloc_size = cpu_to_le64(ntfs_up_cluster(sbi, size));
+	attr->nres.data_size = cpu_to_le64(size);
+	attr->nres.valid_size = attr->nres.data_size;
+	if (is_ext) {
+		attr->name_off = SIZEOF_NONRESIDENT_EX_LE;
+		if (is_attr_compressed(attr))
+			attr->nres.c_unit = COMPRESSION_UNIT;
+
+		attr->nres.run_off =
+			cpu_to_le16(SIZEOF_NONRESIDENT_EX + name_size);
+		memcpy(Add2Ptr(attr, SIZEOF_NONRESIDENT_EX), name,
+		       name_len * sizeof(short));
+	} else {
+		attr->name_off = SIZEOF_NONRESIDENT_LE;
+		attr->nres.run_off =
+			cpu_to_le16(SIZEOF_NONRESIDENT + name_size);
+		memcpy(Add2Ptr(attr, SIZEOF_NONRESIDENT), name,
+		       name_len * sizeof(short));
+	}
+
+	return attr;
+}
+
+/*
+ * do_action
+ *
+ * common routine for the Redo and Undo Passes
+ * If rlsn is NULL then undo
+ */
+static int do_action(struct ntfs_log *log, struct OPEN_ATTR_ENRTY *oe,
+		     const struct LOG_REC_HDR *lrh, u32 op, void *data,
+		     u32 dlen, u32 rec_len, const u64 *rlsn)
+{
+	int err = 0;
+	struct ntfs_sb_info *sbi = log->ni->mi.sbi;
+	struct inode *inode = NULL, *inode_parent;
+	struct mft_inode *mi = NULL, *mi2_child = NULL;
+	CLST rno = 0, rno_base = 0;
+	struct INDEX_BUFFER *ib = NULL;
+	struct MFT_REC *rec = NULL;
+	struct ATTRIB *attr = NULL, *attr2;
+	struct INDEX_HDR *hdr;
+	struct INDEX_ROOT *root;
+	struct NTFS_DE *e, *e1, *e2;
+	struct NEW_ATTRIBUTE_SIZES *new_sz;
+	struct ATTR_FILE_NAME *fname;
+	struct OpenAttr *oa, *oa2;
+	u32 nsize, t32, asize, used, esize, bmp_off, bmp_bits;
+	u16 t16, id, id2;
+	u32 record_size = sbi->record_size;
+	u64 t64;
+	u64 lco = 0;
+	u64 cbo = (u64)le16_to_cpu(lrh->cluster_off) << SECTOR_SHIFT;
+	u64 tvo = le64_to_cpu(lrh->target_vcn) << sbi->cluster_bits;
+	u64 vbo = cbo + tvo;
+	void *buffer_le = NULL;
+	u32 bytes = 0;
+	bool a_dirty = false;
+	u16 data_off;
+
+	oa = oe->ptr;
+
+	/* Big switch to prepare */
+	switch (op) {
+	/* ============================================================
+	 * Process MFT records, as described by the current log record
+	 * ============================================================
+	 */
+	case InitializeFileRecordSegment:
+	case DeallocateFileRecordSegment:
+	case WriteEndOfFileRecordSegment:
+	case CreateAttribute:
+	case DeleteAttribute:
+	case UpdateResidentValue:
+	case UpdateMappingPairs:
+	case SetNewAttributeSizes:
+	case AddIndexEntryRoot:
+	case DeleteIndexEntryRoot:
+	case SetIndexEntryVcnRoot:
+	case UpdateFileNameRoot:
+	case UpdateRecordDataRoot:
+	case ZeroEndOfFileRecord:
+
+		rno = vbo >> sbi->record_bits;
+		inode = ilookup(sbi->sb, rno);
+		if (inode) {
+			mi = &ntfs_i(inode)->mi;
+		} else if (op == InitializeFileRecordSegment) {
+			mi = ntfs_zalloc(sizeof(struct mft_inode));
+			if (!mi)
+				return -ENOMEM;
+			err = mi_format_new(mi, sbi, rno, 0, false);
+			if (err)
+				goto out;
+		} else {
+			/* read from disk */
+			err = mi_get(sbi, rno, &mi);
+			if (err)
+				return err;
+		}
+		rec = mi->mrec;
+
+		if (op == DeallocateFileRecordSegment)
+			goto skip_load_parent;
+
+		if (InitializeFileRecordSegment != op) {
+			if (rec->rhdr.sign == NTFS_BAAD_SIGNATURE)
+				goto dirty_vol;
+			if (!check_lsn(&rec->rhdr, rlsn))
+				goto out;
+			if (!check_file_record(rec, NULL, sbi))
+				goto dirty_vol;
+			attr = Add2Ptr(rec, le16_to_cpu(lrh->record_off));
+		}
+
+		if (is_rec_base(rec) || InitializeFileRecordSegment == op) {
+			rno_base = rno;
+			goto skip_load_parent;
+		}
+
+		rno_base = ino_get(&rec->parent_ref);
+		inode_parent = ntfs_iget5(sbi->sb, &rec->parent_ref, NULL);
+		if (IS_ERR(inode_parent))
+			goto skip_load_parent;
+
+		if (is_bad_inode(inode_parent)) {
+			iput(inode_parent);
+			goto skip_load_parent;
+		}
+
+		if (ni_load_mi_ex(ntfs_i(inode_parent), rno, &mi2_child)) {
+			iput(inode_parent);
+		} else {
+			if (mi2_child->mrec != mi->mrec)
+				memcpy(mi2_child->mrec, mi->mrec,
+				       sbi->record_size);
+
+			if (inode)
+				iput(inode);
+			else if (mi)
+				mi_put(mi);
+
+			inode = inode_parent;
+			mi = mi2_child;
+			rec = mi2_child->mrec;
+			attr = Add2Ptr(rec, le16_to_cpu(lrh->record_off));
+		}
+
+skip_load_parent:
+		inode_parent = NULL;
+		break;
+
+	/* ============================================================
+	 * Process attributes, as described by the current log record
+	 * ============================================================
+	 */
+	case UpdateNonresidentValue:
+	case AddIndexEntryAllocation:
+	case DeleteIndexEntryAllocation:
+	case WriteEndOfIndexBuffer:
+	case SetIndexEntryVcnAllocation:
+	case UpdateFileNameAllocation:
+	case SetBitsInNonresidentBitMap:
+	case ClearBitsInNonresidentBitMap:
+	case UpdateRecordDataAllocation:
+
+		attr = oa->attr;
+		bytes = UpdateNonresidentValue == op ? dlen : 0;
+		lco = (u64)le16_to_cpu(lrh->lcns_follow) << sbi->cluster_bits;
+
+		if (attr->type == ATTR_ALLOC) {
+			t32 = le32_to_cpu(oe->bytes_per_index);
+			if (bytes < t32)
+				bytes = t32;
+		}
+
+		if (!bytes)
+			bytes = lco - cbo;
+
+		bytes += le16_to_cpu(lrh->record_off);
+		if (attr->type == ATTR_ALLOC)
+			bytes = (bytes + 511) & ~511; // align
+
+		buffer_le = ntfs_malloc(bytes);
+		if (!buffer_le)
+			return -ENOMEM;
+
+		err = ntfs_read_run_nb(sbi, oa->run1, vbo, buffer_le, bytes,
+				       NULL);
+		if (err)
+			goto out;
+
+		if (attr->type == ATTR_ALLOC && *(int *)buffer_le)
+			ntfs_fix_post_read(buffer_le, bytes, false);
+		break;
+
+	default:
+		WARN_ON(1);
+	}
+
+	/* Big switch to do operation */
+	switch (op) {
+	case InitializeFileRecordSegment:
+		t16 = le16_to_cpu(lrh->record_off);
+		if (t16 + dlen > record_size)
+			goto dirty_vol;
+
+		memcpy(Add2Ptr(rec, t16), data, dlen);
+		mi->dirty = true;
+		break;
+
+	case DeallocateFileRecordSegment:
+		clear_rec_inuse(rec);
+		le16_add_cpu(&rec->seq, 1);
+		mi->dirty = true;
+		break;
+
+	case WriteEndOfFileRecordSegment:
+		attr2 = (struct ATTRIB *)data;
+		t16 = le16_to_cpu(lrh->record_off);
+
+		if (!check_if_attr(rec, lrh) || t16 + dlen > record_size)
+			goto dirty_vol;
+
+		memmove(attr, attr2, dlen);
+		rec->used = cpu_to_le32(QuadAlign(t16 + dlen));
+
+		mi->dirty = true;
+		break;
+
+	case CreateAttribute:
+		attr2 = (struct ATTRIB *)data;
+		asize = le32_to_cpu(attr2->size);
+		used = le32_to_cpu(rec->used);
+		t16 = le16_to_cpu(lrh->record_off);
+
+		if (!check_if_attr(rec, lrh) || dlen < SIZEOF_RESIDENT ||
+		    !IsQuadAligned(asize) ||
+		    Add2Ptr(attr2, asize) > Add2Ptr(lrh, rec_len) ||
+		    dlen > record_size - used) {
+			goto dirty_vol;
+		}
+
+		memmove(Add2Ptr(attr, asize), attr, used - t16);
+		memcpy(attr, attr2, asize);
+
+		rec->used = cpu_to_le32(used + asize);
+		id = le16_to_cpu(rec->next_attr_id);
+		id2 = le16_to_cpu(attr2->id);
+		if (id <= id2)
+			rec->next_attr_id = cpu_to_le16(id2 + 1);
+		if (is_attr_indexed(attr))
+			le16_add_cpu(&rec->hard_links, 1);
+
+		oa2 = find_loaded_attr(log, attr, rno_base);
+		if (oa2) {
+			void *p2 = ntfs_memdup(attr, le32_to_cpu(attr->size));
+
+			if (p2) {
+				// run_close(oa2->run1);
+				ntfs_free(oa2->attr);
+				oa2->attr = p2;
+			}
+		}
+
+		mi->dirty = true;
+		break;
+
+	case DeleteAttribute:
+		asize = le32_to_cpu(attr->size);
+		used = le32_to_cpu(rec->used);
+		t16 = le16_to_cpu(lrh->record_off);
+
+		if (!check_if_attr(rec, lrh))
+			goto dirty_vol;
+
+		rec->used = cpu_to_le32(used - asize);
+		if (is_attr_indexed(attr))
+			le16_add_cpu(&rec->hard_links, -1);
+
+		memmove(attr, Add2Ptr(attr, asize), used - t16);
+
+		mi->dirty = true;
+		break;
+
+	case UpdateResidentValue:
+		t16 = le16_to_cpu(lrh->attr_off);
+		nsize = t16 + dlen;
+
+		if (!check_if_attr(rec, lrh))
+			goto dirty_vol;
+
+		asize = le32_to_cpu(attr->size);
+		used = le32_to_cpu(rec->used);
+
+		if (lrh->redo_len == lrh->undo_len) {
+			if (nsize > asize)
+				goto dirty_vol;
+			goto move_data;
+		}
+
+		if (nsize > asize && nsize - asize > record_size - used)
+			goto dirty_vol;
+
+		nsize = QuadAlign(nsize);
+		data_off = le16_to_cpu(attr->res.data_off);
+
+		if (nsize < asize) {
+			memmove(Add2Ptr(attr, t16), data, dlen);
+			data = NULL; // To skip below memmove
+		}
+
+		memmove(Add2Ptr(attr, nsize), Add2Ptr(attr, asize),
+			used - le16_to_cpu(lrh->record_off) - asize);
+
+		rec->used = cpu_to_le32(used + nsize - asize);
+		attr->size = cpu_to_le32(nsize);
+		attr->res.data_size = cpu_to_le32(t16 + dlen - data_off);
+
+move_data:
+		if (data)
+			memmove(Add2Ptr(attr, t16), data, dlen);
+
+		oa2 = find_loaded_attr(log, attr, rno_base);
+		if (oa2) {
+			void *p2 = ntfs_memdup(attr, le32_to_cpu(attr->size));
+
+			if (p2) {
+				// run_close(&oa2->run0);
+				oa2->run1 = &oa2->run0;
+				ntfs_free(oa2->attr);
+				oa2->attr = p2;
+			}
+		}
+
+		mi->dirty = true;
+		break;
+
+	case UpdateMappingPairs:
+		t16 = le16_to_cpu(lrh->attr_off);
+		nsize = t16 + dlen;
+		asize = le32_to_cpu(attr->size);
+		used = le32_to_cpu(rec->used);
+
+		if (!check_if_attr(rec, lrh) || !attr->non_res ||
+		    t16 < le16_to_cpu(attr->nres.run_off) || t16 > asize ||
+		    (nsize > asize && nsize - asize > record_size - used)) {
+			goto dirty_vol;
+		}
+
+		nsize = QuadAlign(nsize);
+
+		memmove(Add2Ptr(attr, nsize), Add2Ptr(attr, asize),
+			used - le16_to_cpu(lrh->record_off) - asize);
+		rec->used = cpu_to_le32(used + nsize - asize);
+		attr->size = cpu_to_le32(nsize);
+		memmove(Add2Ptr(attr, t16), data, dlen);
+
+		if (run_get_highest_vcn(le64_to_cpu(attr->nres.svcn),
+					attr_run(attr), &t64)) {
+			goto dirty_vol;
+		}
+
+		attr->nres.evcn = cpu_to_le64(t64);
+		oa2 = find_loaded_attr(log, attr, rno_base);
+		if (oa2 && oa2->attr->non_res)
+			oa2->attr->nres.evcn = attr->nres.evcn;
+
+		mi->dirty = true;
+		break;
+
+	case SetNewAttributeSizes:
+		new_sz = data;
+
+		if (!check_if_attr(rec, lrh) || !attr->non_res)
+			goto dirty_vol;
+
+		attr->nres.alloc_size = new_sz->alloc_size;
+		attr->nres.data_size = new_sz->data_size;
+		attr->nres.valid_size = new_sz->valid_size;
+
+		if (dlen >= sizeof(struct NEW_ATTRIBUTE_SIZES))
+			attr->nres.total_size = new_sz->total_size;
+
+		oa2 = find_loaded_attr(log, attr, rno_base);
+		if (oa2) {
+			void *p2 = ntfs_memdup(attr, le32_to_cpu(attr->size));
+
+			if (p2) {
+				ntfs_free(oa2->attr);
+				oa2->attr = p2;
+			}
+		}
+		mi->dirty = true;
+		break;
+
+	case AddIndexEntryRoot:
+		e = (struct NTFS_DE *)data;
+		esize = le16_to_cpu(e->size);
+		root = resident_data(attr);
+		hdr = &root->ihdr;
+		used = le32_to_cpu(hdr->used);
+
+		if (!check_if_index_root(rec, lrh) ||
+		    !check_if_root_index(attr, hdr, lrh) ||
+		    Add2Ptr(data, esize) > Add2Ptr(lrh, rec_len) ||
+		    esize > le32_to_cpu(rec->total) - le32_to_cpu(rec->used)) {
+			goto dirty_vol;
+		}
+
+		e1 = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+		change_attr_size(rec, attr, le32_to_cpu(attr->size) + esize);
+
+		memmove(Add2Ptr(e1, esize), e1,
+			PtrOffset(e1, Add2Ptr(hdr, used)));
+		memmove(e1, e, esize);
+
+		le32_add_cpu(&attr->res.data_size, esize);
+		hdr->used = cpu_to_le32(used + esize);
+		le32_add_cpu(&hdr->total, esize);
+
+		mi->dirty = true;
+		break;
+
+	case DeleteIndexEntryRoot:
+		root = resident_data(attr);
+		hdr = &root->ihdr;
+		used = le32_to_cpu(hdr->used);
+
+		if (!check_if_index_root(rec, lrh) ||
+		    !check_if_root_index(attr, hdr, lrh)) {
+			goto dirty_vol;
+		}
+
+		e1 = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+		esize = le16_to_cpu(e1->size);
+		e2 = Add2Ptr(e1, esize);
+
+		memmove(e1, e2, PtrOffset(e2, Add2Ptr(hdr, used)));
+
+		le32_sub_cpu(&attr->res.data_size, esize);
+		hdr->used = cpu_to_le32(used - esize);
+		le32_sub_cpu(&hdr->total, esize);
+
+		change_attr_size(rec, attr, le32_to_cpu(attr->size) - esize);
+
+		mi->dirty = true;
+		break;
+
+	case SetIndexEntryVcnRoot:
+		root = resident_data(attr);
+		hdr = &root->ihdr;
+
+		if (!check_if_index_root(rec, lrh) ||
+		    !check_if_root_index(attr, hdr, lrh)) {
+			goto dirty_vol;
+		}
+
+		e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+		de_set_vbn_le(e, *(__le64 *)data);
+		mi->dirty = true;
+		break;
+
+	case UpdateFileNameRoot:
+		root = resident_data(attr);
+		hdr = &root->ihdr;
+
+		if (!check_if_index_root(rec, lrh) ||
+		    !check_if_root_index(attr, hdr, lrh)) {
+			goto dirty_vol;
+		}
+
+		e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+		fname = (struct ATTR_FILE_NAME *)(e + 1);
+		memmove(&fname->dup, data, sizeof(fname->dup)); //
+		mi->dirty = true;
+		break;
+
+	case UpdateRecordDataRoot:
+		root = resident_data(attr);
+		hdr = &root->ihdr;
+
+		if (!check_if_index_root(rec, lrh) ||
+		    !check_if_root_index(attr, hdr, lrh)) {
+			goto dirty_vol;
+		}
+
+		e = Add2Ptr(attr, le16_to_cpu(lrh->attr_off));
+
+		memmove(Add2Ptr(e, le16_to_cpu(e->view.data_off)), data, dlen);
+
+		mi->dirty = true;
+		break;
+
+	case ZeroEndOfFileRecord:
+		t16 = le16_to_cpu(lrh->record_off);
+		if (t16 + dlen > record_size)
+			goto dirty_vol;
+
+		memset(attr, 0, dlen);
+		mi->dirty = true;
+		break;
+
+	case UpdateNonresidentValue:
+		t16 = le16_to_cpu(lrh->record_off);
+
+		if (lco < cbo + t16 + dlen)
+			goto dirty_vol;
+
+		memcpy(Add2Ptr(buffer_le, t16), data, dlen);
+
+		a_dirty = true;
+		if (attr->type == ATTR_ALLOC)
+			ntfs_fix_pre_write(buffer_le, bytes);
+		break;
+
+	case AddIndexEntryAllocation:
+		t16 = le16_to_cpu(lrh->record_off);
+		ib = Add2Ptr(buffer_le, t16);
+		hdr = &ib->ihdr;
+		e = data;
+		esize = le16_to_cpu(e->size);
+		t16 = le16_to_cpu(lrh->attr_off);
+		e1 = Add2Ptr(ib, t16);
+
+		if (is_baad(&ib->rhdr))
+			goto dirty_vol;
+		if (!check_lsn(&ib->rhdr, rlsn))
+			goto out;
+
+		used = le32_to_cpu(hdr->used);
+
+		if (!check_index_buffer(ib, bytes) ||
+		    !check_if_alloc_index(hdr, t16) ||
+		    Add2Ptr(e, esize) > Add2Ptr(lrh, rec_len) ||
+		    used + esize > le32_to_cpu(hdr->total)) {
+			goto dirty_vol;
+		}
+
+		memmove(Add2Ptr(e1, esize), e1,
+			PtrOffset(e1, Add2Ptr(hdr, used)));
+		memcpy(e1, e, esize);
+
+		hdr->used = cpu_to_le32(used + esize);
+
+		a_dirty = true;
+
+		ntfs_fix_pre_write(&ib->rhdr, bytes);
+		break;
+
+	case DeleteIndexEntryAllocation:
+		t16 = le16_to_cpu(lrh->record_off);
+		ib = Add2Ptr(buffer_le, t16);
+		hdr = &ib->ihdr;
+		t16 = le16_to_cpu(lrh->attr_off);
+		e = Add2Ptr(ib, t16);
+		esize = le16_to_cpu(e->size);
+
+		if (is_baad(&ib->rhdr))
+			goto dirty_vol;
+		if (!check_lsn(&ib->rhdr, rlsn))
+			goto out;
+
+		if (!check_index_buffer(ib, bytes) ||
+		    !check_if_alloc_index(hdr, t16)) {
+			goto dirty_vol;
+		}
+
+		e1 = Add2Ptr(e, esize);
+		nsize = esize;
+		used = le32_to_cpu(hdr->used);
+
+		memmove(e, e1, PtrOffset(e1, Add2Ptr(hdr, used)));
+
+		hdr->used = cpu_to_le32(used - nsize);
+
+		a_dirty = true;
+
+		ntfs_fix_pre_write(&ib->rhdr, bytes);
+		break;
+
+	case WriteEndOfIndexBuffer:
+		t16 = le16_to_cpu(lrh->record_off);
+		ib = Add2Ptr(buffer_le, t16);
+		hdr = &ib->ihdr;
+		t16 = le16_to_cpu(lrh->attr_off);
+		e = Add2Ptr(ib, t16);
+
+		if (is_baad(&ib->rhdr))
+			goto dirty_vol;
+		if (!check_lsn(&ib->rhdr, rlsn))
+			goto out;
+		if (!check_index_buffer(ib, bytes) ||
+		    !check_if_alloc_index(hdr, t16) ||
+		    t16 + dlen > offsetof(struct INDEX_BUFFER, ihdr) +
+					 le32_to_cpu(hdr->total)) {
+			goto dirty_vol;
+		}
+
+		hdr->used = cpu_to_le32(dlen + PtrOffset(hdr, e));
+		memmove(e, data, dlen);
+
+		a_dirty = true;
+		ntfs_fix_pre_write(&ib->rhdr, bytes);
+		break;
+
+	case SetIndexEntryVcnAllocation:
+		t16 = le16_to_cpu(lrh->record_off);
+		ib = Add2Ptr(buffer_le, t16);
+		hdr = &ib->ihdr;
+		t16 = le16_to_cpu(lrh->attr_off);
+		e = Add2Ptr(ib, t16);
+
+		if (is_baad(&ib->rhdr))
+			goto dirty_vol;
+
+		if (!check_lsn(&ib->rhdr, rlsn))
+			goto out;
+		if (!check_index_buffer(ib, bytes) ||
+		    !check_if_alloc_index(hdr, t16)) {
+			goto dirty_vol;
+		}
+
+		de_set_vbn_le(e, *(__le64 *)data);
+
+		a_dirty = true;
+		ntfs_fix_pre_write(&ib->rhdr, bytes);
+		break;
+
+	case UpdateFileNameAllocation:
+		t16 = le16_to_cpu(lrh->record_off);
+		ib = Add2Ptr(buffer_le, t16);
+		hdr = &ib->ihdr;
+		t16 = le16_to_cpu(lrh->attr_off);
+		e = Add2Ptr(ib, t16);
+
+		if (is_baad(&ib->rhdr))
+			goto dirty_vol;
+
+		if (!check_lsn(&ib->rhdr, rlsn))
+			goto out;
+		if (!check_index_buffer(ib, bytes) ||
+		    !check_if_alloc_index(hdr, t16)) {
+			goto dirty_vol;
+		}
+
+		fname = (struct ATTR_FILE_NAME *)(e + 1);
+		memmove(&fname->dup, data, sizeof(fname->dup));
+
+		a_dirty = true;
+		ntfs_fix_pre_write(&ib->rhdr, bytes);
+		break;
+
+	case SetBitsInNonresidentBitMap:
+		bmp_off =
+			le32_to_cpu(((struct BITMAP_RANGE *)data)->bitmap_off);
+		bmp_bits = le32_to_cpu(((struct BITMAP_RANGE *)data)->bits);
+		t16 = le16_to_cpu(lrh->record_off);
+
+		if (cbo + (bmp_off + 7) / 8 > lco ||
+		    cbo + ((bmp_off + bmp_bits + 7) / 8) > lco) {
+			goto dirty_vol;
+		}
+
+		__bitmap_set(Add2Ptr(buffer_le, t16), bmp_off, bmp_bits);
+		a_dirty = true;
+		break;
+
+	case ClearBitsInNonresidentBitMap:
+		bmp_off =
+			le32_to_cpu(((struct BITMAP_RANGE *)data)->bitmap_off);
+		bmp_bits = le32_to_cpu(((struct BITMAP_RANGE *)data)->bits);
+		t16 = le16_to_cpu(lrh->record_off);
+
+		if (cbo + (bmp_off + 7) / 8 > lco ||
+		    cbo + ((bmp_off + bmp_bits + 7) / 8) > lco) {
+			goto dirty_vol;
+		}
+
+		__bitmap_clear(Add2Ptr(buffer_le, t16), bmp_off, bmp_bits);
+		a_dirty = true;
+		break;
+
+	case UpdateRecordDataAllocation:
+		t16 = le16_to_cpu(lrh->record_off);
+		ib = Add2Ptr(buffer_le, t16);
+		hdr = &ib->ihdr;
+		t16 = le16_to_cpu(lrh->attr_off);
+		e = Add2Ptr(ib, t16);
+
+		if (is_baad(&ib->rhdr))
+			goto dirty_vol;
+
+		if (!check_lsn(&ib->rhdr, rlsn))
+			goto out;
+		if (!check_index_buffer(ib, bytes) ||
+		    !check_if_alloc_index(hdr, t16)) {
+			goto dirty_vol;
+		}
+
+		memmove(Add2Ptr(e, le16_to_cpu(e->view.data_off)), data, dlen);
+
+		a_dirty = true;
+		ntfs_fix_pre_write(&ib->rhdr, bytes);
+		break;
+
+	default:
+		WARN_ON(1);
+	}
+
+	if (rlsn) {
+		__le64 t64 = cpu_to_le64(*rlsn);
+
+		if (rec)
+			rec->rhdr.lsn = t64;
+		if (ib)
+			ib->rhdr.lsn = t64;
+	}
+
+	if (inode) {
+		err = _ni_write_inode(inode, 0);
+	} else if (mi && mi->dirty) {
+		err = mi_write(mi, 0);
+		if (err)
+			goto out;
+	}
+
+	if (a_dirty) {
+		attr = oa->attr;
+		err = ntfs_sb_write_run(sbi, oa->run1, vbo, buffer_le, bytes);
+		if (err)
+			goto out;
+	}
+
+out:
+
+	if (inode)
+		iput(inode);
+	else if (mi != mi2_child)
+		mi_put(mi);
+
+	ntfs_free(buffer_le);
+
+	return err;
+
+dirty_vol:
+	log->set_dirty = true;
+	goto out;
+}
+
+/*
+ * log_replay
+ *
+ * this function is called during mount operation
+ * it replays log and empties it
+ */
+int log_replay(struct ntfs_inode *ni)
+{
+	int err;
+	struct ntfs_sb_info *sbi = ni->mi.sbi;
+	struct ntfs_log *log;
+
+	struct restart_info rst_info, rst_info2;
+	u64 rec_lsn, ra_lsn, checkpt_lsn = 0, rlsn = 0;
+	struct ATTR_NAME_ENTRY *attr_names = NULL;
+	struct ATTR_NAME_ENTRY *ane;
+	struct RESTART_TABLE *dptbl = NULL;
+	struct RESTART_TABLE *trtbl = NULL;
+	const struct RESTART_TABLE *rt;
+	struct RESTART_TABLE *oatbl = NULL;
+	struct inode *inode;
+	struct OpenAttr *oa;
+	struct ntfs_inode *ni_oe;
+	struct ATTRIB *attr = NULL;
+	u64 size, vcn, undo_next_lsn;
+	CLST rno, lcn, lcn0, len0, clen;
+	void *data;
+	struct NTFS_RESTART *rst = NULL;
+	struct lcb *lcb = NULL;
+	struct OPEN_ATTR_ENRTY *oe;
+	struct TRANSACTION_ENTRY *tr;
+	struct DIR_PAGE_ENTRY *dp;
+	u32 i, bytes_per_attr_entry;
+	u32 l_size = ni->vfs_inode.i_size;
+	u32 orig_file_size = l_size;
+	u32 page_size, vbo, tail, off, dlen;
+	u32 saved_len, rec_len, transact_id;
+	bool use_second_page;
+	struct RESTART_AREA *ra2, *ra = NULL;
+	struct CLIENT_REC *ca, *cr;
+	__le16 client;
+	struct RESTART_HDR *rh;
+	const struct LFS_RECORD_HDR *frh;
+	const struct LOG_REC_HDR *lrh;
+	bool is_mapped;
+	bool is_ro = sb_rdonly(sbi->sb);
+	u64 t64;
+	u16 t16;
+	u32 t32;
+
+	/* Get the size of page. NOTE: To replay we can use default page */
+	page_size = norm_file_page(PAGE_SIZE, &l_size,
+				   PAGE_SIZE >= DefaultLogPageSize &&
+					   PAGE_SIZE <= DefaultLogPageSize * 2);
+	if (!page_size)
+		return -EINVAL;
+
+	log = ntfs_zalloc(sizeof(struct ntfs_log));
+	if (!log)
+		return -ENOMEM;
+
+	log->ni = ni;
+	log->l_size = l_size;
+	log->one_page_buf = ntfs_malloc(page_size);
+
+	if (!log->one_page_buf) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	log->page_size = page_size;
+	log->page_mask = page_size - 1;
+	log->page_bits = blksize_bits(page_size);
+
+	/* Look for a restart area on the disk */
+	err = log_read_rst(log, l_size, true, &rst_info);
+	if (err)
+		goto out;
+
+	if (!rst_info.restart) {
+		if (rst_info.initialized) {
+			/* no restart area but the file is not initialized */
+			err = -EINVAL;
+			goto out;
+		}
+
+		log_init_pg_hdr(log, page_size, page_size, 1, 1);
+		log_create(log, l_size, 0, get_random_int(), false, false);
+
+		log->ra = ra;
+
+		ra = log_create_ra(log);
+		if (!ra) {
+			err = -ENOMEM;
+			goto out;
+		}
+		log->ra = ra;
+		log->init_ra = true;
+
+		goto process_log;
+	}
+
+	/*
+	 * If the restart offset above wasn't zero then we won't
+	 * look for a second restart
+	 */
+	if (rst_info.vbo)
+		goto check_restart_area;
+
+	err = log_read_rst(log, l_size, false, &rst_info2);
+
+	/* Determine which restart area to use */
+	if (!rst_info2.restart || rst_info2.last_lsn <= rst_info.last_lsn)
+		goto use_first_page;
+
+	use_second_page = true;
+
+	if (rst_info.chkdsk_was_run && page_size != rst_info.vbo) {
+		struct RECORD_PAGE_HDR *sp = NULL;
+		bool usa_error;
+
+		if (!read_log_page(log, page_size, &sp, &usa_error) &&
+		    sp->rhdr.sign == NTFS_CHKD_SIGNATURE) {
+			use_second_page = false;
+		}
+		ntfs_free(sp);
+	}
+
+	if (use_second_page) {
+		ntfs_free(rst_info.r_page);
+		memcpy(&rst_info, &rst_info2, sizeof(struct restart_info));
+		rst_info2.r_page = NULL;
+	}
+
+use_first_page:
+	ntfs_free(rst_info2.r_page);
+
+check_restart_area:
+	/* If the restart area is at offset 0, we want to write the second restart area first */
+	log->init_ra = !!rst_info.vbo;
+
+	/* If we have a valid page then grab a pointer to the restart area */
+	ra2 = rst_info.valid_page ?
+		      Add2Ptr(rst_info.r_page,
+			      le16_to_cpu(rst_info.r_page->ra_off)) :
+		      NULL;
+
+	if (rst_info.chkdsk_was_run ||
+	    (ra2 && ra2->client_idx[1] == LFS_NO_CLIENT_LE)) {
+		bool wrapped = false;
+		bool use_multi_page = false;
+		u32 open_log_count;
+
+		/* Do some checks based on whether we have a valid log page */
+		if (!rst_info.valid_page) {
+			open_log_count = get_random_int();
+			goto init_log_instance;
+		}
+		open_log_count = le32_to_cpu(ra2->open_log_count);
+
+		/*
+		 * If the restart page size isn't changing then we want to
+		 * check how much work we need to do
+		 */
+		if (page_size != le32_to_cpu(rst_info.r_page->sys_page_size))
+			goto init_log_instance;
+
+init_log_instance:
+		log_init_pg_hdr(log, page_size, page_size, 1, 1);
+
+		log_create(log, l_size, rst_info.last_lsn, open_log_count,
+			   wrapped, use_multi_page);
+
+		ra = log_create_ra(log);
+		if (!ra) {
+			err = -ENOMEM;
+			goto out;
+		}
+		log->ra = ra;
+
+		/* Put the restart areas and initialize the log file as required */
+		goto process_log;
+	}
+
+	if (!ra2) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/*
+	 * If the log page or the system page sizes have changed, we can't use the log file
+	 * We must use the system page size instead of the default size
+	 * if there is not a clean shutdown
+	 */
+	t32 = le32_to_cpu(rst_info.r_page->sys_page_size);
+	if (page_size != t32) {
+		l_size = orig_file_size;
+		page_size =
+			norm_file_page(t32, &l_size, t32 == DefaultLogPageSize);
+	}
+
+	if (page_size != t32 ||
+	    page_size != le32_to_cpu(rst_info.r_page->page_size)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* If the file size has shrunk then we won't mount it */
+	if (l_size < le64_to_cpu(ra2->l_size)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	log_init_pg_hdr(log, page_size, page_size,
+			le16_to_cpu(rst_info.r_page->major_ver),
+			le16_to_cpu(rst_info.r_page->minor_ver));
+
+	log->l_size = le64_to_cpu(ra2->l_size);
+	log->seq_num_bits = le32_to_cpu(ra2->seq_num_bits);
+	log->file_data_bits = sizeof(u64) * 8 - log->seq_num_bits;
+	log->seq_num_mask = (8 << log->file_data_bits) - 1;
+	log->last_lsn = le64_to_cpu(ra2->current_lsn);
+	log->seq_num = log->last_lsn >> log->file_data_bits;
+	log->ra_off = le16_to_cpu(rst_info.r_page->ra_off);
+	log->restart_size = log->sys_page_size - log->ra_off;
+	log->record_header_len = le16_to_cpu(ra2->rec_hdr_len);
+	log->ra_size = le16_to_cpu(ra2->ra_len);
+	log->data_off = le16_to_cpu(ra2->data_off);
+	log->data_size = log->page_size - log->data_off;
+	log->reserved = log->data_size - log->record_header_len;
+
+	vbo = lsn_to_vbo(log, log->last_lsn);
+
+	if (vbo < log->first_page) {
+		/* This is a pseudo lsn */
+		log->l_flags |= NTFSLOG_NO_LAST_LSN;
+		log->next_page = log->first_page;
+		goto find_oldest;
+	}
+
+	/* Find the end of this log record */
+	off = final_log_off(log, log->last_lsn,
+			    le32_to_cpu(ra2->last_lsn_data_len));
+
+	/* If we wrapped the file then increment the sequence number */
+	if (off <= vbo) {
+		log->seq_num += 1;
+		log->l_flags |= NTFSLOG_WRAPPED;
+	}
+
+	/* Now compute the next log page to use */
+	vbo &= ~log->sys_page_mask;
+	tail = log->page_size - (off & log->page_mask) - 1;
+
+	/* If we can fit another log record on the page, move back a page the log file */
+	if (tail >= log->record_header_len) {
+		log->l_flags |= NTFSLOG_REUSE_TAIL;
+		log->next_page = vbo;
+	} else {
+		log->next_page = next_page_off(log, vbo);
+	}
+
+find_oldest:
+	/* Find the oldest client lsn. Use the last flushed lsn as a starting point */
+	log->oldest_lsn = log->last_lsn;
+	oldest_client_lsn(Add2Ptr(ra2, le16_to_cpu(ra2->client_off)),
+			  ra2->client_idx[1], &log->oldest_lsn);
+	log->oldest_lsn_off = lsn_to_vbo(log, log->oldest_lsn);
+
+	if (log->oldest_lsn_off < log->first_page)
+		log->l_flags |= NTFSLOG_NO_OLDEST_LSN;
+
+	if (!(ra2->flags & RESTART_SINGLE_PAGE_IO))
+		log->l_flags |= NTFSLOG_WRAPPED | NTFSLOG_MULTIPLE_PAGE_IO;
+
+	log->current_openlog_count = le32_to_cpu(ra2->open_log_count);
+	log->total_avail_pages = log->l_size - log->first_page;
+	log->total_avail = log->total_avail_pages >> log->page_bits;
+	log->max_current_avail = log->total_avail * log->reserved;
+	log->total_avail = log->total_avail * log->data_size;
+
+	log->current_avail = current_log_avail(log);
+
+	ra = ntfs_zalloc(log->restart_size);
+	if (!ra) {
+		err = -ENOMEM;
+		goto out;
+	}
+	log->ra = ra;
+
+	t16 = le16_to_cpu(ra2->client_off);
+	if (t16 == offsetof(struct RESTART_AREA, clients)) {
+		memcpy(ra, ra2, log->ra_size);
+	} else {
+		memcpy(ra, ra2, offsetof(struct RESTART_AREA, clients));
+		memcpy(ra->clients, Add2Ptr(ra2, t16),
+		       le16_to_cpu(ra2->ra_len) - t16);
+
+		log->current_openlog_count = get_random_int();
+		ra->open_log_count = cpu_to_le32(log->current_openlog_count);
+		log->ra_size = offsetof(struct RESTART_AREA, clients) +
+			       sizeof(struct CLIENT_REC);
+		ra->client_off =
+			cpu_to_le16(offsetof(struct RESTART_AREA, clients));
+		ra->ra_len = cpu_to_le16(log->ra_size);
+	}
+
+	le32_add_cpu(&ra->open_log_count, 1);
+
+	/* Now we need to walk through looking for the last lsn */
+	err = last_log_lsn(log);
+	if (err)
+		goto out;
+
+	log->current_avail = current_log_avail(log);
+
+	/* Remember which restart area to write first */
+	log->init_ra = rst_info.vbo;
+
+process_log:
+	/* 1.0, 1.1, 2.0 log->major_ver/minor_ver - short values */
+	switch ((log->major_ver << 16) + log->minor_ver) {
+	case 0x10000:
+	case 0x10001:
+	case 0x20000:
+		break;
+	default:
+		ntfs_warn(sbi->sb, "$LogFile version %d.%d is not supported",
+			  log->major_ver, log->minor_ver);
+		err = -EOPNOTSUPP;
+		log->set_dirty = true;
+		goto out;
+	}
+
+	/* One client "NTFS" per logfile */
+	ca = Add2Ptr(ra, le16_to_cpu(ra->client_off));
+
+	for (client = ra->client_idx[1];; client = cr->next_client) {
+		if (client == LFS_NO_CLIENT_LE) {
+			/* Insert "NTFS" client LogFile */
+			client = ra->client_idx[0];
+			if (client == LFS_NO_CLIENT_LE)
+				return -EINVAL;
+
+			t16 = le16_to_cpu(client);
+			cr = ca + t16;
+
+			remove_client(ca, cr, &ra->client_idx[0]);
+
+			cr->restart_lsn = 0;
+			cr->oldest_lsn = cpu_to_le64(log->oldest_lsn);
+			cr->name_bytes = cpu_to_le32(8);
+			cr->name[0] = cpu_to_le16('N');
+			cr->name[1] = cpu_to_le16('T');
+			cr->name[2] = cpu_to_le16('F');
+			cr->name[3] = cpu_to_le16('S');
+
+			add_client(ca, t16, &ra->client_idx[1]);
+			break;
+		}
+
+		cr = ca + le16_to_cpu(client);
+
+		if (cpu_to_le32(8) == cr->name_bytes &&
+		    cpu_to_le16('N') == cr->name[0] &&
+		    cpu_to_le16('T') == cr->name[1] &&
+		    cpu_to_le16('F') == cr->name[2] &&
+		    cpu_to_le16('S') == cr->name[3])
+			break;
+	}
+
+	/* Update the client handle with the client block information */
+	log->client_id.seq_num = cr->seq_num;
+	log->client_id.client_idx = client;
+
+	err = read_rst_area(log, &rst, &ra_lsn);
+	if (err)
+		goto out;
+
+	if (!rst)
+		goto out;
+
+	bytes_per_attr_entry = !rst->major_ver ? 0x2C : 0x28;
+
+	checkpt_lsn = le64_to_cpu(rst->check_point_start);
+	if (!checkpt_lsn)
+		checkpt_lsn = ra_lsn;
+
+	/* Allocate and Read the Transaction Table */
+	if (!rst->transact_table_len)
+		goto check_dirty_page_table;
+
+	t64 = le64_to_cpu(rst->transact_table_lsn);
+	err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+	if (err)
+		goto out;
+
+	lrh = lcb->log_rec;
+	frh = lcb->lrh;
+	rec_len = le32_to_cpu(frh->client_data_len);
+
+	if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+			   bytes_per_attr_entry)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	t16 = le16_to_cpu(lrh->redo_off);
+
+	rt = Add2Ptr(lrh, t16);
+	t32 = rec_len - t16;
+
+	/* Now check that this is a valid restart table */
+	if (!check_rstbl(rt, t32)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	trtbl = ntfs_memdup(rt, t32);
+	if (!trtbl) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	lcb_put(lcb);
+	lcb = NULL;
+
+check_dirty_page_table:
+	/* The next record back should be the Dirty Pages Table */
+	if (!rst->dirty_pages_len)
+		goto check_attribute_names;
+
+	t64 = le64_to_cpu(rst->dirty_pages_table_lsn);
+	err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+	if (err)
+		goto out;
+
+	lrh = lcb->log_rec;
+	frh = lcb->lrh;
+	rec_len = le32_to_cpu(frh->client_data_len);
+
+	if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+			   bytes_per_attr_entry)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	t16 = le16_to_cpu(lrh->redo_off);
+
+	rt = Add2Ptr(lrh, t16);
+	t32 = rec_len - t16;
+
+	/* Now check that this is a valid restart table */
+	if (!check_rstbl(rt, t32)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	dptbl = ntfs_memdup(rt, t32);
+	if (!dptbl) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	/* Convert Ra version '0' into version '1' */
+	if (rst->major_ver)
+		goto end_conv_1;
+
+	dp = NULL;
+	while ((dp = enum_rstbl(dptbl, dp))) {
+		struct DIR_PAGE_ENTRY_32 *dp0 = (struct DIR_PAGE_ENTRY_32 *)dp;
+		// NOTE: Danger. Check for of boundary
+		memmove(&dp->vcn, &dp0->vcn_low,
+			2 * sizeof(u64) +
+				le32_to_cpu(dp->lcns_follow) * sizeof(u64));
+	}
+
+end_conv_1:
+	lcb_put(lcb);
+	lcb = NULL;
+
+	/* Go through the table and remove the duplicates, remembering the oldest lsn values */
+	if (sbi->cluster_size <= log->page_size)
+		goto trace_dp_table;
+
+	dp = NULL;
+	while ((dp = enum_rstbl(dptbl, dp))) {
+		struct DIR_PAGE_ENTRY *next = dp;
+
+		while ((next = enum_rstbl(dptbl, next))) {
+			if (next->target_attr == dp->target_attr &&
+			    next->vcn == dp->vcn) {
+				if (le64_to_cpu(next->oldest_lsn) <
+				    le64_to_cpu(dp->oldest_lsn)) {
+					dp->oldest_lsn = next->oldest_lsn;
+				}
+
+				free_rsttbl_idx(dptbl, PtrOffset(dptbl, next));
+			}
+		}
+	}
+trace_dp_table:
+check_attribute_names:
+	/* The next record should be the Attribute Names */
+	if (!rst->attr_names_len)
+		goto check_attr_table;
+
+	t64 = le64_to_cpu(rst->attr_names_lsn);
+	err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+	if (err)
+		goto out;
+
+	lrh = lcb->log_rec;
+	frh = lcb->lrh;
+	rec_len = le32_to_cpu(frh->client_data_len);
+
+	if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+			   bytes_per_attr_entry)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	t32 = lrh_length(lrh);
+	rec_len -= t32;
+
+	attr_names = ntfs_memdup(Add2Ptr(lrh, t32), rec_len);
+
+	lcb_put(lcb);
+	lcb = NULL;
+
+check_attr_table:
+	/* The next record should be the attribute Table */
+	if (!rst->open_attr_len)
+		goto check_attribute_names2;
+
+	t64 = le64_to_cpu(rst->open_attr_table_lsn);
+	err = read_log_rec_lcb(log, t64, lcb_ctx_prev, &lcb);
+	if (err)
+		goto out;
+
+	lrh = lcb->log_rec;
+	frh = lcb->lrh;
+	rec_len = le32_to_cpu(frh->client_data_len);
+
+	if (!check_log_rec(lrh, rec_len, le32_to_cpu(frh->transact_id),
+			   bytes_per_attr_entry)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	t16 = le16_to_cpu(lrh->redo_off);
+
+	rt = Add2Ptr(lrh, t16);
+	t32 = rec_len - t16;
+
+	if (!check_rstbl(rt, t32)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	oatbl = ntfs_memdup(rt, t32);
+	if (!oatbl) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	log->open_attr_tbl = oatbl;
+
+	/* Clear all of the Attr pointers */
+	oe = NULL;
+	while ((oe = enum_rstbl(oatbl, oe))) {
+		if (!rst->major_ver) {
+			struct OPEN_ATTR_ENRTY_32 oe0;
+
+			/* Really 'oe' points to OPEN_ATTR_ENRTY_32 */
+			memcpy(&oe0, oe, SIZEOF_OPENATTRIBUTEENTRY0);
+
+			oe->bytes_per_index = oe0.bytes_per_index;
+			oe->type = oe0.type;
+			oe->is_dirty_pages = oe0.is_dirty_pages;
+			oe->name_len = 0;
+			oe->ref = oe0.ref;
+			oe->open_record_lsn = oe0.open_record_lsn;
+		}
+
+		oe->is_attr_name = 0;
+		oe->ptr = NULL;
+	}
+
+	lcb_put(lcb);
+	lcb = NULL;
+
+check_attribute_names2:
+	if (!rst->attr_names_len)
+		goto trace_attribute_table;
+
+	ane = attr_names;
+	if (!oatbl)
+		goto trace_attribute_table;
+	while (ane->off) {
+		/* TODO: Clear table on exit! */
+		oe = Add2Ptr(oatbl, le16_to_cpu(ane->off));
+		t16 = le16_to_cpu(ane->name_bytes);
+		oe->name_len = t16 / sizeof(short);
+		oe->ptr = ane->name;
+		oe->is_attr_name = 2;
+		ane = Add2Ptr(ane, sizeof(struct ATTR_NAME_ENTRY) + t16);
+	}
+
+trace_attribute_table:
+	/*
+	 * If the checkpt_lsn is zero, then this is a freshly
+	 * formatted disk and we have no work to do
+	 */
+	if (!checkpt_lsn) {
+		err = 0;
+		goto out;
+	}
+
+	if (!oatbl) {
+		oatbl = init_rsttbl(bytes_per_attr_entry, 8);
+		if (!oatbl) {
+			err = -ENOMEM;
+			goto out;
+		}
+	}
+
+	log->open_attr_tbl = oatbl;
+
+	/* Start the analysis pass from the Checkpoint lsn. */
+	rec_lsn = checkpt_lsn;
+
+	/* Read the first lsn */
+	err = read_log_rec_lcb(log, checkpt_lsn, lcb_ctx_next, &lcb);
+	if (err)
+		goto out;
+
+	/* Loop to read all subsequent records to the end of the log file */
+next_log_record_analyze:
+	err = read_next_log_rec(log, lcb, &rec_lsn);
+	if (err)
+		goto out;
+
+	if (!rec_lsn)
+		goto end_log_records_enumerate;
+
+	frh = lcb->lrh;
+	transact_id = le32_to_cpu(frh->transact_id);
+	rec_len = le32_to_cpu(frh->client_data_len);
+	lrh = lcb->log_rec;
+
+	if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/*
+	 * The first lsn after the previous lsn remembered
+	 * the checkpoint is the first candidate for the rlsn
+	 */
+	if (!rlsn)
+		rlsn = rec_lsn;
+
+	if (LfsClientRecord != frh->record_type)
+		goto next_log_record_analyze;
+
+	/*
+	 * Now update the Transaction Table for this transaction
+	 * If there is no entry present or it is unallocated we allocate the entry
+	 */
+	if (!trtbl) {
+		trtbl = init_rsttbl(sizeof(struct TRANSACTION_ENTRY),
+				    INITIAL_NUMBER_TRANSACTIONS);
+		if (!trtbl) {
+			err = -ENOMEM;
+			goto out;
+		}
+	}
+
+	tr = Add2Ptr(trtbl, transact_id);
+
+	if (transact_id >= bytes_per_rt(trtbl) ||
+	    tr->next != RESTART_ENTRY_ALLOCATED_LE) {
+		tr = alloc_rsttbl_from_idx(&trtbl, transact_id);
+		if (!tr) {
+			err = -ENOMEM;
+			goto out;
+		}
+		tr->transact_state = TransactionActive;
+		tr->first_lsn = cpu_to_le64(rec_lsn);
+	}
+
+	tr->prev_lsn = tr->undo_next_lsn = cpu_to_le64(rec_lsn);
+
+	/*
+	 * If this is a compensation log record, then change
+	 * the undo_next_lsn to be the undo_next_lsn of this record
+	 */
+	if (lrh->undo_op == cpu_to_le16(CompensationLogRecord))
+		tr->undo_next_lsn = frh->client_undo_next_lsn;
+
+	/* Dispatch to handle log record depending on type */
+	switch (le16_to_cpu(lrh->redo_op)) {
+	case InitializeFileRecordSegment:
+	case DeallocateFileRecordSegment:
+	case WriteEndOfFileRecordSegment:
+	case CreateAttribute:
+	case DeleteAttribute:
+	case UpdateResidentValue:
+	case UpdateNonresidentValue:
+	case UpdateMappingPairs:
+	case SetNewAttributeSizes:
+	case AddIndexEntryRoot:
+	case DeleteIndexEntryRoot:
+	case AddIndexEntryAllocation:
+	case DeleteIndexEntryAllocation:
+	case WriteEndOfIndexBuffer:
+	case SetIndexEntryVcnRoot:
+	case SetIndexEntryVcnAllocation:
+	case UpdateFileNameRoot:
+	case UpdateFileNameAllocation:
+	case SetBitsInNonresidentBitMap:
+	case ClearBitsInNonresidentBitMap:
+	case UpdateRecordDataRoot:
+	case UpdateRecordDataAllocation:
+	case ZeroEndOfFileRecord:
+		t16 = le16_to_cpu(lrh->target_attr);
+		t64 = le64_to_cpu(lrh->target_vcn);
+		dp = find_dp(dptbl, t16, t64);
+
+		if (dp)
+			goto copy_lcns;
+
+		/*
+		 * Calculate the number of clusters per page the system
+		 * which wrote the checkpoint, possibly creating the table
+		 */
+		if (dptbl) {
+			t32 = 1 + (le16_to_cpu(dptbl->size) -
+				   sizeof(struct DIR_PAGE_ENTRY)) /
+					  sizeof(u64);
+		} else {
+			t32 = log->clst_per_page;
+			ntfs_free(dptbl);
+			dptbl = init_rsttbl(sizeof(struct DIR_PAGE_ENTRY) +
+						    (t32 - 1) * sizeof(u64),
+					    32);
+			if (!dptbl) {
+				err = -ENOMEM;
+				goto out;
+			}
+		}
+
+		dp = alloc_rsttbl_idx(&dptbl);
+		dp->target_attr = cpu_to_le32(t16);
+		dp->transfer_len = cpu_to_le32(t32 << sbi->cluster_bits);
+		dp->lcns_follow = cpu_to_le32(t32);
+		dp->vcn = cpu_to_le64(t64 & ~((u64)t32 - 1));
+		dp->oldest_lsn = cpu_to_le64(rec_lsn);
+
+copy_lcns:
+		/*
+		 * Copy the Lcns from the log record into the Dirty Page Entry
+		 * TODO: for different page size support, must somehow make
+		 * whole routine a loop, case Lcns do not fit below
+		 */
+		t16 = le16_to_cpu(lrh->lcns_follow);
+		for (i = 0; i < t16; i++) {
+			size_t j = (size_t)(le64_to_cpu(lrh->target_vcn) -
+					    le64_to_cpu(dp->vcn));
+			dp->page_lcns[j + i] = lrh->page_lcns[i];
+		}
+
+		goto next_log_record_analyze;
+
+	case DeleteDirtyClusters: {
+		u32 range_count =
+			le16_to_cpu(lrh->redo_len) / sizeof(struct LCN_RANGE);
+		const struct LCN_RANGE *r =
+			Add2Ptr(lrh, le16_to_cpu(lrh->redo_off));
+
+		/* Loop through all of the Lcn ranges this log record */
+		for (i = 0; i < range_count; i++, r++) {
+			u64 lcn0 = le64_to_cpu(r->lcn);
+			u64 lcn_e = lcn0 + le64_to_cpu(r->len) - 1;
+
+			dp = NULL;
+			while ((dp = enum_rstbl(dptbl, dp))) {
+				u32 j;
+
+				t32 = le32_to_cpu(dp->lcns_follow);
+				for (j = 0; j < t32; j++) {
+					t64 = le64_to_cpu(dp->page_lcns[j]);
+					if (t64 >= lcn0 && t64 <= lcn_e)
+						dp->page_lcns[j] = 0;
+				}
+			}
+		}
+		goto next_log_record_analyze;
+		;
+	}
+
+	case OpenNonresidentAttribute:
+		t16 = le16_to_cpu(lrh->target_attr);
+		if (t16 >= bytes_per_rt(oatbl)) {
+			/*
+			 * Compute how big the table needs to be.
+			 * Add 10 extra entries for some cushion
+			 */
+			u32 new_e = t16 / le16_to_cpu(oatbl->size);
+
+			new_e += 10 - le16_to_cpu(oatbl->used);
+
+			oatbl = extend_rsttbl(oatbl, new_e, ~0u);
+			log->open_attr_tbl = oatbl;
+			if (!oatbl) {
+				err = -ENOMEM;
+				goto out;
+			}
+		}
+
+		/* Point to the entry being opened */
+		oe = alloc_rsttbl_from_idx(&oatbl, t16);
+		log->open_attr_tbl = oatbl;
+		if (!oe) {
+			err = -ENOMEM;
+			goto out;
+		}
+
+		/* Initialize this entry from the log record */
+		t16 = le16_to_cpu(lrh->redo_off);
+		if (!rst->major_ver) {
+			/* Convert version '0' into version '1' */
+			struct OPEN_ATTR_ENRTY_32 *oe0 = Add2Ptr(lrh, t16);
+
+			oe->bytes_per_index = oe0->bytes_per_index;
+			oe->type = oe0->type;
+			oe->is_dirty_pages = oe0->is_dirty_pages;
+			oe->name_len = 0; //oe0.name_len;
+			oe->ref = oe0->ref;
+			oe->open_record_lsn = oe0->open_record_lsn;
+		} else {
+			memcpy(oe, Add2Ptr(lrh, t16), bytes_per_attr_entry);
+		}
+
+		t16 = le16_to_cpu(lrh->undo_len);
+		if (t16) {
+			oe->ptr = ntfs_malloc(t16);
+			if (!oe->ptr) {
+				err = -ENOMEM;
+				goto out;
+			}
+			oe->name_len = t16 / sizeof(short);
+			memcpy(oe->ptr,
+			       Add2Ptr(lrh, le16_to_cpu(lrh->undo_off)), t16);
+			oe->is_attr_name = 1;
+		} else {
+			oe->ptr = NULL;
+			oe->is_attr_name = 0;
+		}
+
+		goto next_log_record_analyze;
+
+	case HotFix:
+		t16 = le16_to_cpu(lrh->target_attr);
+		t64 = le64_to_cpu(lrh->target_vcn);
+		dp = find_dp(dptbl, t16, t64);
+		if (dp) {
+			size_t j = le64_to_cpu(lrh->target_vcn) -
+				   le64_to_cpu(dp->vcn);
+			if (dp->page_lcns[j])
+				dp->page_lcns[j] = lrh->page_lcns[0];
+		}
+		goto next_log_record_analyze;
+
+	case EndTopLevelAction:
+		tr = Add2Ptr(trtbl, transact_id);
+		tr->prev_lsn = cpu_to_le64(rec_lsn);
+		tr->undo_next_lsn = frh->client_undo_next_lsn;
+		goto next_log_record_analyze;
+
+	case PrepareTransaction:
+		tr = Add2Ptr(trtbl, transact_id);
+		tr->transact_state = TransactionPrepared;
+		goto next_log_record_analyze;
+
+	case CommitTransaction:
+		tr = Add2Ptr(trtbl, transact_id);
+		tr->transact_state = TransactionCommitted;
+		goto next_log_record_analyze;
+
+	case ForgetTransaction:
+		free_rsttbl_idx(trtbl, transact_id);
+		goto next_log_record_analyze;
+
+	case Noop:
+	case OpenAttributeTableDump:
+	case AttributeNamesDump:
+	case DirtyPageTableDump:
+	case TransactionTableDump:
+		/* The following cases require no action the Analysis Pass */
+		goto next_log_record_analyze;
+
+	default:
+		/*
+		 * All codes will be explicitly handled.
+		 * If we see a code we do not expect, then we are trouble
+		 */
+		goto next_log_record_analyze;
+	}
+
+end_log_records_enumerate:
+	lcb_put(lcb);
+	lcb = NULL;
+
+	/*
+	 * Scan the Dirty Page Table and Transaction Table for
+	 * the lowest lsn, and return it as the Redo lsn
+	 */
+	dp = NULL;
+	while ((dp = enum_rstbl(dptbl, dp))) {
+		t64 = le64_to_cpu(dp->oldest_lsn);
+		if (t64 && t64 < rlsn)
+			rlsn = t64;
+	}
+
+	tr = NULL;
+	while ((tr = enum_rstbl(trtbl, tr))) {
+		t64 = le64_to_cpu(tr->first_lsn);
+		if (t64 && t64 < rlsn)
+			rlsn = t64;
+	}
+
+	/* Only proceed if the Dirty Page Table or Transaction table are not empty */
+	if ((!dptbl || !dptbl->total) && (!trtbl || !trtbl->total))
+		goto end_reply;
+
+	sbi->flags |= NTFS_FLAGS_NEED_REPLAY;
+	if (is_ro)
+		goto out;
+
+	/* Reopen all of the attributes with dirty pages */
+	oe = NULL;
+next_open_attribute:
+
+	oe = enum_rstbl(oatbl, oe);
+	if (!oe) {
+		err = 0;
+		dp = NULL;
+		goto next_dirty_page;
+	}
+
+	oa = ntfs_zalloc(sizeof(struct OpenAttr));
+	if (!oa) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	inode = ntfs_iget5(sbi->sb, &oe->ref, NULL);
+	if (IS_ERR(inode))
+		goto fake_attr;
+
+	if (is_bad_inode(inode)) {
+		iput(inode);
+fake_attr:
+		if (oa->ni) {
+			iput(&oa->ni->vfs_inode);
+			oa->ni = NULL;
+		}
+
+		attr = attr_create_nonres_log(sbi, oe->type, 0, oe->ptr,
+					      oe->name_len, 0);
+		if (!attr) {
+			ntfs_free(oa);
+			err = -ENOMEM;
+			goto out;
+		}
+		oa->attr = attr;
+		oa->run1 = &oa->run0;
+		goto final_oe;
+	}
+
+	ni_oe = ntfs_i(inode);
+	oa->ni = ni_oe;
+
+	attr = ni_find_attr(ni_oe, NULL, NULL, oe->type, oe->ptr, oe->name_len,
+			    NULL, NULL);
+
+	if (!attr)
+		goto fake_attr;
+
+	t32 = le32_to_cpu(attr->size);
+	oa->attr = ntfs_memdup(attr, t32);
+	if (!oa->attr)
+		goto fake_attr;
+
+	if (!S_ISDIR(inode->i_mode)) {
+		if (attr->type == ATTR_DATA && !attr->name_len) {
+			oa->run1 = &ni_oe->file.run;
+			goto final_oe;
+		}
+	} else {
+		if (attr->type == ATTR_ALLOC &&
+		    attr->name_len == ARRAY_SIZE(I30_NAME) &&
+		    !memcmp(attr_name(attr), I30_NAME, sizeof(I30_NAME))) {
+			oa->run1 = &ni_oe->dir.alloc_run;
+			goto final_oe;
+		}
+	}
+
+	if (attr->non_res) {
+		u16 roff = le16_to_cpu(attr->nres.run_off);
+		CLST svcn = le64_to_cpu(attr->nres.svcn);
+
+		err = run_unpack(&oa->run0, sbi, inode->i_ino, svcn,
+				 le64_to_cpu(attr->nres.evcn), svcn,
+				 Add2Ptr(attr, roff), t32 - roff);
+		if (err < 0) {
+			ntfs_free(oa->attr);
+			oa->attr = NULL;
+			goto fake_attr;
+		}
+		err = 0;
+	}
+	oa->run1 = &oa->run0;
+	attr = oa->attr;
+
+final_oe:
+	if (oe->is_attr_name == 1)
+		ntfs_free(oe->ptr);
+	oe->is_attr_name = 0;
+	oe->ptr = oa;
+	oe->name_len = attr->name_len;
+
+	goto next_open_attribute;
+
+	/*
+	 * Now loop through the dirty page table to extract all of the Vcn/Lcn
+	 * Mapping that we have, and insert it into the appropriate run
+	 */
+next_dirty_page:
+	dp = enum_rstbl(dptbl, dp);
+	if (!dp)
+		goto do_redo_1;
+
+	oe = Add2Ptr(oatbl, le32_to_cpu(dp->target_attr));
+
+	if (oe->next != RESTART_ENTRY_ALLOCATED_LE)
+		goto next_dirty_page;
+
+	oa = oe->ptr;
+	if (!oa)
+		goto next_dirty_page;
+
+	i = -1;
+next_dirty_page_vcn:
+	i += 1;
+	if (i >= le32_to_cpu(dp->lcns_follow))
+		goto next_dirty_page;
+
+	vcn = le64_to_cpu(dp->vcn) + i;
+	size = (vcn + 1) << sbi->cluster_bits;
+
+	if (!dp->page_lcns[i])
+		goto next_dirty_page_vcn;
+
+	rno = ino_get(&oe->ref);
+	if (rno <= MFT_REC_MIRR &&
+	    size < (MFT_REC_VOL + 1) * sbi->record_size &&
+	    oe->type == ATTR_DATA) {
+		goto next_dirty_page_vcn;
+	}
+
+	lcn = le64_to_cpu(dp->page_lcns[i]);
+
+	if ((!run_lookup_entry(oa->run1, vcn, &lcn0, &len0, NULL) ||
+	     lcn0 != lcn) &&
+	    !run_add_entry(oa->run1, vcn, lcn, 1, false)) {
+		err = -ENOMEM;
+		goto out;
+	}
+	attr = oa->attr;
+	t64 = le64_to_cpu(attr->nres.alloc_size);
+	if (size > t64) {
+		attr->nres.valid_size = attr->nres.data_size =
+			attr->nres.alloc_size = cpu_to_le64(size);
+	}
+	goto next_dirty_page_vcn;
+
+do_redo_1:
+	/*
+	 * Perform the Redo Pass, to restore all of the dirty pages to the same
+	 * contents that they had immediately before the crash
+	 * If the dirty page table is empty, then we can skip the entire Redo Pass
+	 */
+	if (!dptbl || !dptbl->total)
+		goto do_undo_action;
+
+	rec_lsn = rlsn;
+
+	/*
+	 * Read the record at the Redo lsn, before falling
+	 * into common code to handle each record
+	 */
+	err = read_log_rec_lcb(log, rlsn, lcb_ctx_next, &lcb);
+	if (err)
+		goto out;
+
+	/*
+	 * Now loop to read all of our log records forwards,
+	 * until we hit the end of the file, cleaning up at the end
+	 */
+do_action_next:
+	frh = lcb->lrh;
+
+	if (LfsClientRecord != frh->record_type)
+		goto read_next_log_do_action;
+
+	transact_id = le32_to_cpu(frh->transact_id);
+	rec_len = le32_to_cpu(frh->client_data_len);
+	lrh = lcb->log_rec;
+
+	if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Ignore log records that do not update pages */
+	if (lrh->lcns_follow)
+		goto find_dirty_page;
+
+	goto read_next_log_do_action;
+
+find_dirty_page:
+	t16 = le16_to_cpu(lrh->target_attr);
+	t64 = le64_to_cpu(lrh->target_vcn);
+	dp = find_dp(dptbl, t16, t64);
+
+	if (!dp)
+		goto read_next_log_do_action;
+
+	if (rec_lsn < le64_to_cpu(dp->oldest_lsn))
+		goto read_next_log_do_action;
+
+	t16 = le16_to_cpu(lrh->target_attr);
+	if (t16 >= bytes_per_rt(oatbl)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	oe = Add2Ptr(oatbl, t16);
+
+	if (oe->next != RESTART_ENTRY_ALLOCATED_LE) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	oa = oe->ptr;
+
+	if (!oa) {
+		err = -EINVAL;
+		goto out;
+	}
+	attr = oa->attr;
+
+	vcn = le64_to_cpu(lrh->target_vcn);
+
+	if (!run_lookup_entry(oa->run1, vcn, &lcn, NULL, NULL) ||
+	    lcn == SPARSE_LCN) {
+		goto read_next_log_do_action;
+	}
+
+	/* Point to the Redo data and get its length */
+	data = Add2Ptr(lrh, le16_to_cpu(lrh->redo_off));
+	dlen = le16_to_cpu(lrh->redo_len);
+
+	/* Shorten length by any Lcns which were deleted */
+	saved_len = dlen;
+
+	for (i = le16_to_cpu(lrh->lcns_follow); i; i--) {
+		size_t j;
+		u32 alen, voff;
+
+		voff = le16_to_cpu(lrh->record_off) +
+		       le16_to_cpu(lrh->attr_off);
+		voff += le16_to_cpu(lrh->cluster_off) << SECTOR_SHIFT;
+
+		/* If the Vcn question is allocated, we can just get out.*/
+		j = le64_to_cpu(lrh->target_vcn) - le64_to_cpu(dp->vcn);
+		if (dp->page_lcns[j + i - 1])
+			break;
+
+		if (!saved_len)
+			saved_len = 1;
+
+		/*
+		 * Calculate the allocated space left relative to the
+		 * log record Vcn, after removing this unallocated Vcn
+		 */
+		alen = (i - 1) << sbi->cluster_bits;
+
+		/*
+		 * If the update described this log record goes beyond
+		 * the allocated space, then we will have to reduce the length
+		 */
+		if (voff >= alen)
+			dlen = 0;
+		else if (voff + dlen > alen)
+			dlen = alen - voff;
+	}
+
+	/* If the resulting dlen from above is now zero, we can skip this log record */
+	if (!dlen && saved_len)
+		goto read_next_log_do_action;
+
+	t16 = le16_to_cpu(lrh->redo_op);
+	if (can_skip_action(t16))
+		goto read_next_log_do_action;
+
+	/* Apply the Redo operation a common routine */
+	err = do_action(log, oe, lrh, t16, data, dlen, rec_len, &rec_lsn);
+	if (err)
+		goto out;
+
+	/* Keep reading and looping back until end of file */
+read_next_log_do_action:
+	err = read_next_log_rec(log, lcb, &rec_lsn);
+	if (!err && rec_lsn)
+		goto do_action_next;
+
+	lcb_put(lcb);
+	lcb = NULL;
+
+do_undo_action:
+	/* Scan Transaction Table */
+	tr = NULL;
+transaction_table_next:
+	tr = enum_rstbl(trtbl, tr);
+	if (!tr)
+		goto undo_action_done;
+
+	if (TransactionActive != tr->transact_state || !tr->undo_next_lsn) {
+		free_rsttbl_idx(trtbl, PtrOffset(trtbl, tr));
+		goto transaction_table_next;
+	}
+
+	log->transaction_id = PtrOffset(trtbl, tr);
+	undo_next_lsn = le64_to_cpu(tr->undo_next_lsn);
+
+	/*
+	 * We only have to do anything if the transaction has
+	 * something its undo_next_lsn field
+	 */
+	if (!undo_next_lsn)
+		goto commit_undo;
+
+	/* Read the first record to be undone by this transaction */
+	err = read_log_rec_lcb(log, undo_next_lsn, lcb_ctx_undo_next, &lcb);
+	if (err)
+		goto out;
+
+	/*
+	 * Now loop to read all of our log records forwards,
+	 * until we hit the end of the file, cleaning up at the end
+	 */
+undo_action_next:
+
+	lrh = lcb->log_rec;
+	frh = lcb->lrh;
+	transact_id = le32_to_cpu(frh->transact_id);
+	rec_len = le32_to_cpu(frh->client_data_len);
+
+	if (!check_log_rec(lrh, rec_len, transact_id, bytes_per_attr_entry)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (lrh->undo_op == cpu_to_le16(Noop))
+		goto read_next_log_undo_action;
+
+	oe = Add2Ptr(oatbl, le16_to_cpu(lrh->target_attr));
+	oa = oe->ptr;
+
+	t16 = le16_to_cpu(lrh->lcns_follow);
+	if (!t16)
+		goto add_allocated_vcns;
+
+	is_mapped = run_lookup_entry(oa->run1, le64_to_cpu(lrh->target_vcn),
+				     &lcn, &clen, NULL);
+
+	/*
+	 * If the mapping isn't already the table or the  mapping
+	 * corresponds to a hole the mapping, we need to make sure
+	 * there is no partial page already memory
+	 */
+	if (is_mapped && lcn != SPARSE_LCN && clen >= t16)
+		goto add_allocated_vcns;
+
+	vcn = le64_to_cpu(lrh->target_vcn);
+	vcn &= ~(log->clst_per_page - 1);
+
+add_allocated_vcns:
+	for (i = 0, vcn = le64_to_cpu(lrh->target_vcn),
+	    size = (vcn + 1) << sbi->cluster_bits;
+	     i < t16; i++, vcn += 1, size += sbi->cluster_size) {
+		attr = oa->attr;
+		if (!attr->non_res) {
+			if (size > le32_to_cpu(attr->res.data_size))
+				attr->res.data_size = cpu_to_le32(size);
+		} else {
+			if (size > le64_to_cpu(attr->nres.data_size))
+				attr->nres.valid_size = attr->nres.data_size =
+					attr->nres.alloc_size =
+						cpu_to_le64(size);
+		}
+	}
+
+	t16 = le16_to_cpu(lrh->undo_op);
+	if (can_skip_action(t16))
+		goto read_next_log_undo_action;
+
+	/* Point to the Redo data and get its length */
+	data = Add2Ptr(lrh, le16_to_cpu(lrh->undo_off));
+	dlen = le16_to_cpu(lrh->undo_len);
+
+	/* it is time to apply the undo action */
+	err = do_action(log, oe, lrh, t16, data, dlen, rec_len, NULL);
+
+read_next_log_undo_action:
+	/*
+	 * Keep reading and looping back until we have read the
+	 * last record for this transaction
+	 */
+	err = read_next_log_rec(log, lcb, &rec_lsn);
+	if (err)
+		goto out;
+
+	if (rec_lsn)
+		goto undo_action_next;
+
+commit_undo:
+	free_rsttbl_idx(trtbl, log->transaction_id);
+
+	log->transaction_id = 0;
+
+	goto transaction_table_next;
+
+undo_action_done:
+
+	ntfs_update_mftmirr(sbi, 0);
+
+	sbi->flags &= ~NTFS_FLAGS_NEED_REPLAY;
+
+end_reply:
+
+	err = 0;
+	if (is_ro)
+		goto out;
+
+	rh = ntfs_zalloc(log->page_size);
+	if (!rh) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	rh->rhdr.sign = NTFS_RSTR_SIGNATURE;
+	rh->rhdr.fix_off = cpu_to_le16(offsetof(struct RESTART_HDR, fixups));
+	t16 = (log->page_size >> SECTOR_SHIFT) + 1;
+	rh->rhdr.fix_num = cpu_to_le16(t16);
+	rh->sys_page_size = cpu_to_le32(log->page_size);
+	rh->page_size = cpu_to_le32(log->page_size);
+
+	t16 = QuadAlign(offsetof(struct RESTART_HDR, fixups) +
+			sizeof(short) * t16);
+	rh->ra_off = cpu_to_le16(t16);
+	rh->minor_ver = cpu_to_le16(1); // 0x1A:
+	rh->major_ver = cpu_to_le16(1); // 0x1C:
+
+	ra2 = Add2Ptr(rh, t16);
+	memcpy(ra2, ra, sizeof(struct RESTART_AREA));
+
+	ra2->client_idx[0] = 0;
+	ra2->client_idx[1] = LFS_NO_CLIENT_LE;
+	ra2->flags = cpu_to_le16(2);
+
+	le32_add_cpu(&ra2->open_log_count, 1);
+
+	ntfs_fix_pre_write(&rh->rhdr, log->page_size);
+
+	err = ntfs_sb_write_run(sbi, &ni->file.run, 0, rh, log->page_size);
+	if (!err)
+		err = ntfs_sb_write_run(sbi, &log->ni->file.run, log->page_size,
+					rh, log->page_size);
+
+	ntfs_free(rh);
+	if (err)
+		goto out;
+
+out:
+	ntfs_free(rst);
+	if (lcb)
+		lcb_put(lcb);
+
+	/* Scan the Open Attribute Table to close all of the open attributes */
+	oe = NULL;
+	while ((oe = enum_rstbl(oatbl, oe))) {
+		rno = ino_get(&oe->ref);
+
+		if (oe->is_attr_name == 1) {
+			ntfs_free(oe->ptr);
+			oe->ptr = NULL;
+			continue;
+		}
+
+		if (oe->is_attr_name)
+			continue;
+
+		oa = oe->ptr;
+		if (!oa)
+			continue;
+
+		run_close(&oa->run0);
+		ntfs_free(oa->attr);
+		if (oa->ni)
+			iput(&oa->ni->vfs_inode);
+		ntfs_free(oa);
+	}
+
+	ntfs_free(trtbl);
+	ntfs_free(oatbl);
+	ntfs_free(dptbl);
+	ntfs_free(attr_names);
+	ntfs_free(rst_info.r_page);
+
+	ntfs_free(ra);
+	ntfs_free(log->one_page_buf);
+
+	if (err)
+		sbi->flags |= NTFS_FLAGS_NEED_REPLAY;
+
+	if (err == -EROFS)
+		err = 0;
+	else if (log->set_dirty)
+		ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
+
+	ntfs_free(log);
+
+	return err;
+}

From patchwork Thu Jan 28 09:04:53 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053219
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 0F08FC433E0
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:58 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id C830864DD1
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:57 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S231713AbhA1JIq (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:08:46 -0500
Received: from relaydlg-01.paragon-software.com ([81.5.88.159]:37296 "EHLO
        relaydlg-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231267AbhA1JGY (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:06:24 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relaydlg-01.paragon-software.com (Postfix) with ESMTPS id
 CB52B8230B;
        Thu, 28 Jan 2021 12:05:04 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824704;
        bh=TaTxJZ4dKHZvKF0oNwlUWpRd4Bh/dxfeqyfwXR0NB+s=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=csLiLhdZa4vt7H2hH9eReOKhj7armRg3oqToFb+Jts0/H3jhTTSRHxcIQ8Pfr+PLp
         WYKW7r5a5ZDzWk4q/aN9SptHYU2TvnkIgX1fwEFX2kZUt/2I7y7iBc3sveJoRH0Aoa
         pEkZBFyGfyKoFzKGxpM8Zk+BfMKTxBovWtljLxc4=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:04 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 08/10] fs/ntfs3: Add Kconfig, Makefile and doc
Date: Thu, 28 Jan 2021 12:04:53 +0300
Message-ID: 
 <20210128090455.3576502-9-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds Kconfig, Makefile and doc

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 Documentation/filesystems/ntfs3.rst | 107 ++++++++++++++++++++++++++++
 fs/ntfs3/Kconfig                    |  45 ++++++++++++
 fs/ntfs3/Makefile                   |  31 ++++++++
 3 files changed, 183 insertions(+)
 create mode 100644 Documentation/filesystems/ntfs3.rst
 create mode 100644 fs/ntfs3/Kconfig
 create mode 100644 fs/ntfs3/Makefile

diff --git a/Documentation/filesystems/ntfs3.rst b/Documentation/filesystems/ntfs3.rst
new file mode 100644
index 000000000000..fb29067360cc
--- /dev/null
+++ b/Documentation/filesystems/ntfs3.rst
@@ -0,0 +1,107 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====
+NTFS3
+=====
+
+
+Summary and Features
+====================
+
+NTFS3 is fully functional NTFS Read-Write driver. The driver works with
+NTFS versions up to 3.1, normal/compressed/sparse files
+and journal replaying. File system type to use on mount is 'ntfs3'.
+
+- This driver implements NTFS read/write support for normal, sparse and
+  compressed files.
+- Supports native journal replaying;
+- Supports extended attributes
+	Predefined extended attributes:
+	- 'system.ntfs_security' gets/sets security
+			descriptor (SECURITY_DESCRIPTOR_RELATIVE)
+	- 'system.ntfs_attrib' gets/sets ntfs file/dir attributes.
+		Note: applied to empty files, this allows to switch type between
+		sparse(0x200), compressed(0x800) and normal;
+- Supports NFS export of mounted NTFS volumes.
+
+Mount Options
+=============
+
+The list below describes mount options supported by NTFS3 driver in addition to
+generic ones.
+
+===============================================================================
+
+nls=name		This option informs the driver how to interpret path
+			strings and translate them to Unicode and back. If
+			this option is not set, the default codepage will be
+			used (CONFIG_NLS_DEFAULT).
+			Examples:
+				'nls=utf8'
+
+uid=
+gid=
+umask=			Controls the default permissions for files/directories created
+			after the NTFS volume is mounted.
+
+fmask=
+dmask=			Instead of specifying umask which applies both to
+			files and directories, fmask applies only to files and
+			dmask only to directories.
+
+nohidden		Files with the Windows-specific HIDDEN (FILE_ATTRIBUTE_HIDDEN)
+			attribute will not be shown under Linux.
+
+sys_immutable		Files with the Windows-specific SYSTEM
+			(FILE_ATTRIBUTE_SYSTEM) attribute will be marked as system
+			immutable files.
+
+discard			Enable support of the TRIM command for improved performance
+			on delete operations, which is recommended for use with the
+			solid-state drives (SSD).
+
+force			Forces the driver to mount partitions even if 'dirty' flag
+			(volume dirty) is set. Not recommended for use.
+
+sparse			Create new files as "sparse".
+
+showmeta		Use this parameter to show all meta-files (System Files) on
+			a mounted NTFS partition.
+			By default, all meta-files are hidden.
+
+prealloc		Preallocate space for files excessively when file size is
+			increasing on writes. Decreases fragmentation in case of
+			parallel write operations to different files.
+
+no_acs_rules		"No access rules" mount option sets access rights for
+			files/folders to 777 and owner/group to root. This mount
+			option absorbs all other permissions:
+			- permissions change for files/folders will be reported
+				as successful, but they will remain 777;
+			- owner/group change will be reported as successful, but
+				they will stay as root
+
+acl			Support POSIX ACLs (Access Control Lists). Effective if
+			supported by Kernel. Not to be confused with NTFS ACLs.
+			The option specified as acl enables support for POSIX ACLs.
+
+noatime			All files and directories will not update their last access
+			time attribute if a partition is mounted with this parameter.
+			This option can speed up file system operation.
+
+===============================================================================
+
+ToDo list
+=========
+
+- Full journaling support (currently journal replaying is supported) over JBD.
+
+
+References
+==========
+https://www.paragon-software.com/home/ntfs-linux-professional/
+	- Commercial version of the NTFS driver for Linux.
+
+almaz.alexandrovich@paragon-software.com
+	- Direct e-mail address for feedback and requests on the NTFS3 implementation.
+
diff --git a/fs/ntfs3/Kconfig b/fs/ntfs3/Kconfig
new file mode 100644
index 000000000000..4dde88e79f31
--- /dev/null
+++ b/fs/ntfs3/Kconfig
@@ -0,0 +1,45 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config NTFS3_FS
+	tristate "NTFS Read-Write file system support"
+	select NLS
+	help
+	  Windows OS native file system (NTFS) support up to NTFS version 3.1.
+
+	  Y or M enables the NTFS3 driver with full features enabled (read,
+	  write, journal replaying, sparse/compressed files support).
+	  File system type to use on mount is "ntfs3". Module name (M option)
+	  is also "ntfs3".
+
+	  Documentation: <file:Documentation/filesystems/ntfs3.rst>
+
+config NTFS3_64BIT_CLUSTER
+	bool "64 bits per NTFS clusters"
+	depends on NTFS3_FS && 64BIT
+	help
+	  Windows implementation of ntfs.sys uses 32 bits per clusters.
+	  If activated 64 bits per clusters you will be able to use 4k cluster
+	  for 16T+ volumes. Windows will not be able to mount such volumes.
+
+	  It is recommended to say N here.
+
+config NTFS3_LZX_XPRESS
+	bool "activate support of external compressions lzx/xpress"
+	depends on NTFS3_FS
+	help
+	  In Windows 10 one can use command "compact" to compress any files.
+	  4 possible variants of compression are: xpress4k, xpress8k, xpress16k and lzx.
+
+	  It is recommended to say Y here.
+
+config NTFS3_FS_POSIX_ACL
+	bool "NTFS POSIX Access Control Lists"
+	depends on NTFS3_FS
+	select FS_POSIX_ACL
+	help
+	  POSIX Access Control Lists (ACLs) support additional access rights
+	  for users and groups beyond the standard owner/group/world scheme,
+	  and this option selects support for ACLs specifically for ntfs
+	  filesystems.
+	  NOTE: this is linux only feature. Windows will ignore these ACLs.
+
+	  If you don't know what Access Control Lists are, say N
diff --git a/fs/ntfs3/Makefile b/fs/ntfs3/Makefile
new file mode 100644
index 000000000000..b9aacc061781
--- /dev/null
+++ b/fs/ntfs3/Makefile
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the ntfs3 filesystem support.
+#
+
+obj-$(CONFIG_NTFS3_FS) += ntfs3.o
+
+ntfs3-y :=	attrib.o \
+		attrlist.o \
+		bitfunc.o \
+		bitmap.o \
+		dir.o \
+		fsntfs.o \
+		frecord.o \
+		file.o \
+		fslog.o \
+		inode.o \
+		index.o \
+		lznt.o \
+		namei.o \
+		record.o \
+		run.o \
+		super.o \
+		upcase.o \
+		xattr.o
+
+ntfs3-$(CONFIG_NTFS3_LZX_XPRESS) += $(addprefix lib/,\
+		decompress_common.o \
+		lzx_decompress.o \
+		xpress_decompress.o \
+		)
\ No newline at end of file

From patchwork Thu Jan 28 09:04:54 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053221
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id A88E1C433E6
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:09:40 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 5F7F564DD9
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:09:40 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S232201AbhA1JJ0 (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:09:26 -0500
Received: from relaydlg-01.paragon-software.com ([81.5.88.159]:37391 "EHLO
        relaydlg-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231990AbhA1JHA (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:07:00 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relaydlg-01.paragon-software.com (Postfix) with ESMTPS id
 4CC9D8230C;
        Thu, 28 Jan 2021 12:05:05 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824705;
        bh=ryC1O+KjPCuBYVk17xa91BOxLPmINLD4FPXxdPXSapU=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=ILdck63Ony/zLqnaBDlV86ZGKau027a6Q1Q+CC+YZ8bK8XGxNCrhyCGBlRcfnmffM
         dQLL0mr29ISZdOeALbga4pnIL4Tvc/JpY8pf0UHf7YJ2IJQrBTfzBseUoWowePUT6G
         5P8GYMBpCldZFieFZJealg1OPXhgXN+JU3jE1+qg=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:04 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 09/10] fs/ntfs3: Add NTFS3 in fs/Kconfig and fs/Makefile
Date: Thu, 28 Jan 2021 12:04:54 +0300
Message-ID: 
 <20210128090455.3576502-10-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds NTFS3 in fs/Kconfig and fs/Makefile

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 fs/Kconfig  | 1 +
 fs/Makefile | 1 +
 2 files changed, 2 insertions(+)

diff --git a/fs/Kconfig b/fs/Kconfig
index aa4c12282301..eae96d55ab67 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -145,6 +145,7 @@ menu "DOS/FAT/EXFAT/NT Filesystems"
 source "fs/fat/Kconfig"
 source "fs/exfat/Kconfig"
 source "fs/ntfs/Kconfig"
+source "fs/ntfs3/Kconfig"
 
 endmenu
 endif # BLOCK
diff --git a/fs/Makefile b/fs/Makefile
index 999d1a23f036..4f5242cdaee2 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -100,6 +100,7 @@ obj-$(CONFIG_SYSV_FS)		+= sysv/
 obj-$(CONFIG_CIFS)		+= cifs/
 obj-$(CONFIG_HPFS_FS)		+= hpfs/
 obj-$(CONFIG_NTFS_FS)		+= ntfs/
+obj-$(CONFIG_NTFS3_FS)		+= ntfs3/
 obj-$(CONFIG_UFS_FS)		+= ufs/
 obj-$(CONFIG_EFS_FS)		+= efs/
 obj-$(CONFIG_JFFS2_FS)		+= jffs2/

From patchwork Thu Jan 28 09:04:55 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Komarov
 <almaz.alexandrovich@paragon-software.com>
X-Patchwork-Id: 12053217
Return-Path: <linux-fsdevel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-23.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,
	URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id EFCABC433DB
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:57 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 9AEE164DD9
	for <linux-fsdevel@archiver.kernel.org>;
 Thu, 28 Jan 2021 09:08:57 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S232210AbhA1JIi (ORCPT
        <rfc822;linux-fsdevel@archiver.kernel.org>);
        Thu, 28 Jan 2021 04:08:38 -0500
Received: from relayfre-01.paragon-software.com ([176.12.100.13]:47910 "EHLO
        relayfre-01.paragon-software.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S231561AbhA1JGY (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Thu, 28 Jan 2021 04:06:24 -0500
Received: from dlg2.mail.paragon-software.com
 (vdlg-exch-02.paragon-software.com [172.30.1.105])
        by relayfre-01.paragon-software.com (Postfix) with ESMTPS id
 2781E1F89;
        Thu, 28 Jan 2021 12:05:05 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=paragon-software.com; s=mail; t=1611824705;
        bh=KLNyrps2qmGeDqx5R4mR8X8dmMbswrU34yUsJwY7weI=;
        h=From:To:CC:Subject:Date:In-Reply-To:References;
        b=A1p3uVrCGngcrPmgvVnR68AoTmkMBYfaNRfdMkNjd+it8BODL+CPsgxwswRPEOZGu
         UP7/D1df+GQnPY1HjDDHI6ayObgu9aSikoTc3ZDDaTRZW/m9lRDLrElHoUMGsPITE4
         uS+aOCazVsp9aLTO/gpO4bslnTCJRQczuRddmU2s=
Received: from fsd-lkpg.ufsd.paragon-software.com (172.30.114.105) by
 vdlg-exch-02.paragon-software.com (172.30.1.105) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1847.3; Thu, 28 Jan 2021 12:05:04 +0300
From: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
To: <linux-fsdevel@vger.kernel.org>
CC: <viro@zeniv.linux.org.uk>, <linux-kernel@vger.kernel.org>,
        <pali@kernel.org>, <dsterba@suse.cz>, <aaptel@suse.com>,
        <willy@infradead.org>, <rdunlap@infradead.org>, <joe@perches.com>,
        <mark@harmstone.com>, <nborisov@suse.com>,
        <linux-ntfs-dev@lists.sourceforge.net>, <anton@tuxera.com>,
        <dan.carpenter@oracle.com>, <hch@lst.de>, <ebiggers@kernel.org>,
        <andy.lavr@gmail.com>,
        Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Subject: [PATCH v19 10/10] fs/ntfs3: Add MAINTAINERS
Date: Thu, 28 Jan 2021 12:04:55 +0300
Message-ID: 
 <20210128090455.3576502-11-almaz.alexandrovich@paragon-software.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
References: 
 <20210128090455.3576502-1-almaz.alexandrovich@paragon-software.com>
MIME-Version: 1.0
X-Originating-IP: [172.30.114.105]
X-ClientProxiedBy: vdlg-exch-02.paragon-software.com (172.30.1.105) To
 vdlg-exch-02.paragon-software.com (172.30.1.105)
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds MAINTAINERS

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
---
 MAINTAINERS | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 992fe3b0900a..33fc38d7bb2f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12668,6 +12668,13 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs.git
 F:	Documentation/filesystems/ntfs.rst
 F:	fs/ntfs/
 
+NTFS3 FILESYSTEM
+M:	Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
+S:	Supported
+W:	http://www.paragon-software.com/
+F:	Documentation/filesystems/ntfs3.rst
+F:	fs/ntfs3/
+
 NUBUS SUBSYSTEM
 M:	Finn Thain <fthain@telegraphics.com.au>
 L:	linux-m68k@lists.linux-m68k.org