From patchwork Fri Apr 6 11:41:51 2018
X-Patchwork-Submitter: Sayan Ghosh
X-Patchwork-Id: 10325887
From: Sayan Ghosh
Date: Fri, 6 Apr 2018 17:11:51 +0530
Subject: [Patch 2/4] Controlling block allocation of a file with respect to grading information
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, "Bhattacharya, Suparna", niloy ganguly, Madhumita Mallick, "Bharde, Madhumita"

Grades are being read from the extended attribute while preallocating the blocks for a single graded file.
We assume binary grading of the file blocks: high-graded blocks are placed
in the persistent memory region of the LVM, and lower-graded blocks are
placed in the HDD portion of the LVM. Here we alter the block allocation
method in the functions ext4_ext_map_blocks() and ext4_alloc_file_blocks().
Leveraging the existing goal-block allocation to obtain goals in different
tiers according to the grades has not yet been done.

Consider an LVM logical volume segmented as follows:

  --- Segments ---
  Logical extents 0 to 1219:
    Type                linear
    Physical volume     /dev/sda11
    Physical extents    0 to 1219

  Logical extents 1220 to 1474:
    Type                linear
    Physical volume     /dev/pmem0
    Physical extents    0 to 254

We hard-code LOW_GRADE_STARTING_BLOCK as 0*1024 and
HIGH_GRADE_STARTING_BLOCK as 1220*1024 (= 1249280), the initial logical
block numbers of the respective tiers.

FIX ME and TODO comments mark the places that still need work.

The patch applies on top of Linux kernel 4.7.2.

Signed-off-by: Sayan Ghosh
---
 fs/ext4/ext4.h    |   1 +
 fs/ext4/extents.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 114 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b9ec0ca..c7d2eed 100755
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3201,6 +3201,7 @@ struct ext4_extent;
 extern unsigned long long read_count_xattr(struct inode *inode);
 extern void read_grade_xattr(struct inode *inode,struct grade_struct *grade_array);
 extern int is_file_graded(struct inode *inode);
+extern int find_grade(struct grade_struct* grade_array, unsigned long long total, ext4_fsblk_t val, unsigned long long *req_len);
 
 /*
  * Maximum number of logical blocks in a file; ext4_extent's ee_block is
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index de9194f..aaff3a3 100755
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -58,6 +58,12 @@
 #define EXT4_EXT_DATA_VALID2	0x10 /* second half contains valid data */
 
 /*
+ * Starting block numbers for low and high grades
+ */
+#define LOW_GRADE_STARTING_BLOCK 0
+#define HIGH_GRADE_STARTING_BLOCK 1249280
+
+/*
  * read_grade_xattr() is used to read the grade array from the extended attribute.
  */
 void read_grade_xattr(struct inode *inode,struct grade_struct *grade_array)
@@ -92,6 +98,43 @@ unsigned long long read_count_xattr(struct inode *inode)
 	return total;
 }
 
+/*
+ * find_grade() is to find the grade of a logical block.
+ * This also returns the length of graded or ungraded portion
+ * starting from that logical block number (gets stored in the variable
+ * req_len). The return value is 1 for high grade and 0 otherwise.
+ */
+int find_grade(struct grade_struct* grade_array, unsigned long long total, ext4_fsblk_t val, unsigned long long *req_len)
+{
+	if (val >= (grade_array[total -1].block_num + grade_array[total -1].len) ){
+		if (req_len != NULL)
+			(*req_len) = 0;
+		return 0;
+	}
+	unsigned long long beg, end, mid;
+	beg = 0;
+	end = total-1;
+	while (beg <= end){
+		mid = (beg + end)/2;
+		if ((val >= grade_array[mid].block_num) && (val <= (grade_array[mid].block_num + grade_array[mid].len - 1)) ){
+			if (req_len != NULL)
+				(*req_len) = grade_array[mid].len;
+			return 1;
+		}
+		if(beg == end)
+			break;
+		if (grade_array[mid].block_num > val){
+			end = (mid > 0) ? (mid - 1) : 0;
+		}
+		else{
+			beg = mid + 1;
+		}
+	}
+	if (req_len != NULL)
+		(*req_len) = grade_array[mid].block_num - val;
+	return 0;
+}
+
 static __le32 ext4_extent_block_csum(struct inode *inode,
 				     struct ext4_extent_header *eh)
 {
@@ -4326,6 +4369,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 	struct ext4_extent newex, *ex, *ex2;
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	ext4_fsblk_t newblock = 0;
+
 	int free_on_err = 0, err = 0, depth, ret;
 	unsigned int allocated = 0, offset = 0;
 	unsigned int allocated_clusters = 0;
@@ -4333,6 +4377,14 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 	ext4_lblk_t cluster_offset;
 	bool map_from_cluster = false;
 
+	struct grade_struct *grade_array = NULL;
+	unsigned long long total;
+	if (is_file_graded(inode)){
+		total = read_count_xattr(inode);
+		grade_array = (struct grade_struct *)kmalloc(total*sizeof(struct grade_struct), GFP_USER);
+		read_grade_xattr(inode,grade_array);
+	}
+
 	ext_debug("blocks %u/%u requested for inode %lu\n",
 		  map->m_lblk, map->m_len, inode->i_ino);
 	trace_ext4_ext_map_blocks_enter(inode, map->m_lblk, map->m_len, flags);
@@ -4494,8 +4546,36 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 
 	/* allocate new block */
 	ar.inode = inode;
-	ar.goal = ext4_ext_find_goal(inode, path, map->m_lblk);
+	if(!is_file_graded(inode)){
+		ar.goal = ext4_ext_find_goal(inode, path, map->m_lblk);
+	}
+
+	/*
+	 * ** FIX ME **
+	 * Now accessing different goals for different tiers is hard coded.
+	 * Please suggest a method to maintain multiple goal states in different tiers,
+	 * each corresponding to the respective grades for proper goal-block placement.
+	 *
+	 * ** TODO 1 **
+	 * Instead of hard-coding LOW_GRADE_STARTING_BLOCK and HIGH_GRADE_STARTING_BLOCK
+	 * set their values automatically from the LVM (see the description).
+	 *
+	 * ** TODO 2 **
+	 * It is assumed that higher grade storage area will not overflow.
+	 * We need to take care of the case when high grade storage device gets full
+	 * and data has to be stored in the lower tier.
+	 */
+	else{
+		unsigned long long temp;
+		if(find_grade(grade_array,total,map->m_lblk,&temp) == 0){
+			ar.goal = LOW_GRADE_STARTING_BLOCK;
+		}
+		if(find_grade(grade_array,total,map->m_lblk,&temp) == 1){
+			ar.goal = HIGH_GRADE_STARTING_BLOCK;
+		}
+	}
 	ar.logical = map->m_lblk;
+
 	/*
 	 * We calculate the offset from the beginning of the cluster
 	 * for the logical block number, since when we allocate a
@@ -4519,7 +4599,12 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 		ar.flags |= EXT4_MB_DELALLOC_RESERVED;
 	if (flags & EXT4_GET_BLOCKS_METADATA_NOFAIL)
 		ar.flags |= EXT4_MB_USE_RESERVED;
+	if(is_file_graded(inode)){
+		ar.flags |= EXT4_MB_HINT_NOPREALLOC;
+	}
 	newblock = ext4_mb_new_blocks(handle, &ar, &err);
+
+go_out:
 	if (!newblock)
 		goto out2;
 	ext_debug("allocate new block: goal %llu, found %llu/%u\n",
@@ -4706,6 +4791,8 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset,
 {
 	struct inode *inode = file_inode(file);
 	handle_t *handle;
+
+	int grade_val = 0;
 	int ret = 0;
 	int ret2 = 0;
 	int retries = 0;
@@ -4713,9 +4800,17 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset,
 	struct ext4_map_blocks map;
 	unsigned int credits;
 	loff_t epos;
-
 	map.m_lblk = offset;
 	map.m_len = len;
+
+	struct grade_struct *grade_array = NULL;
+	unsigned long long total;
+	if (is_file_graded(inode)){
+		total = read_count_xattr(inode);
+		grade_array = (struct grade_struct *)kmalloc(total*sizeof(struct grade_struct), GFP_USER);
+		read_grade_xattr(inode,grade_array);
+	}
+
 	/*
 	 * Don't normalize the request if it can fit in one extent so
 	 * that it doesn't get unnecessarily split into multiple
@@ -4735,10 +4830,23 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset,
 		depth = ext_depth(inode);
 	else
 		depth = -1;
-
 retry:
 	while (ret >= 0 && len) {
 		/*
+		 * Finding length of blocks which have same grade
+		 * and they are preallocated together.
+		 */
+		if (is_file_graded(inode)){
+			map.m_len = 1;
+			unsigned long long req_len;
+			grade_val = find_grade(grade_array,total,map.m_lblk,&req_len);
+			if (req_len == 0)
+				map.m_len = len;
+			else
+				map.m_len = req_len;
+		}
+
+		/*
 		 * Recalculate credits when extent tree depth changes.
 		 */
 		if (depth >= 0 && depth != ext_depth(inode)) {
@@ -4753,6 +4861,7 @@ retry:
 			break;
 		}
 		ret = ext4_map_blocks(handle, inode, &map, flags);
+
 		if (ret <= 0) {
 			ext4_debug("inode #%lu: block %u: len %u: "
 				   "ext4_ext_map_blocks returned %d",
@@ -4762,6 +4871,7 @@ retry:
 			ret2 = ext4_journal_stop(handle);
 			break;
 		}
+
 		map.m_lblk += ret;
 		map.m_len = len = len - ret;
 		epos = (loff_t)map.m_lblk << inode->i_blkbits;
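
For discussion, the grade lookup and the tier goal selection described
above can be sketched in userspace C. This is only an illustration under
stated assumptions, not the patch's code. struct grade_extent,
grade_lookup(), tier_goal() and the two macros are hypothetical names. The
field types are assumed, and the array is assumed to be sorted by block_num
with non-overlapping runs. The factor 1024 is taken from the description
(1220*1024); reading it as filesystem blocks per LVM extent is an
assumption. Unlike find_grade() in the patch, this sketch reports the
length remaining from the queried block rather than the full length of the
run, and it tolerates an empty array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the patch's struct grade_struct.
 * Field names follow the patch; the field types are assumed. */
struct grade_extent {
	unsigned long long block_num;	/* first logical block of a high-grade run */
	unsigned long long len;		/* number of blocks in the run */
};

/* Assumed: 1024 filesystem blocks per LVM extent, as implied by the
 * 1220*1024 arithmetic in the patch description. */
#define BLOCKS_PER_LVM_EXTENT	1024ULL
#define HIGH_TIER_FIRST_EXTENT	1220ULL

/*
 * Look up the grade of logical block val. The array g is assumed to be
 * sorted by block_num and to hold non-overlapping runs.
 *
 * Returns 1 if val lies inside a high-grade run, 0 otherwise. If run_len
 * is not NULL it receives the number of blocks from val to the end of the
 * current graded run or ungraded gap; 0 means no graded run follows val.
 */
static int grade_lookup(const struct grade_extent *g, size_t total,
			unsigned long long val, unsigned long long *run_len)
{
	size_t lo = 0, hi = total;	/* half-open search window [lo, hi) */
	unsigned long long len = 0;
	int grade = 0;

	/* Find the first run whose start is strictly greater than val. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (g[mid].block_num <= val)
			lo = mid + 1;
		else
			hi = mid;
	}

	if (lo > 0 && val - g[lo - 1].block_num < g[lo - 1].len) {
		/* val falls inside the run that starts at or before it. */
		grade = 1;
		len = g[lo - 1].len - (val - g[lo - 1].block_num);
	} else if (lo < total) {
		/* val is in a gap; measure up to the next graded run. */
		len = g[lo].block_num - val;
	}

	if (run_len)
		*run_len = len;
	return grade;
}

/* Map a binary grade to the first block of its tier (the allocation goal). */
static unsigned long long tier_goal(int grade)
{
	assert(grade == 0 || grade == 1);
	return grade ? HIGH_TIER_FIRST_EXTENT * BLOCKS_PER_LVM_EXTENT : 0ULL;
}
```

With runs {10, 5} and {100, 20}, grade_lookup() on block 12 returns 1 with
3 blocks remaining, and on block 15 returns 0 with a gap of 85 blocks. The
half-open window avoids the unsigned underflow that "end = mid - 1" has to
guard against.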