From patchwork Sat Jul 29 13:36:55 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timofey Titovets
X-Patchwork-Id: 9869951
From: Timofey Titovets
To: linux-btrfs@vger.kernel.org
Cc: Timofey Titovets
Subject: [PATCH v3 3/3] Btrfs: heuristic add byte core set calculation
Date: Sat, 29 Jul 2017 16:36:55 +0300
Message-Id: <20170729133655.31260-4-nefelim4ag@gmail.com>
X-Mailer: git-send-email 2.13.3
In-Reply-To: <20170729133655.31260-1-nefelim4ag@gmail.com>
References: <20170729133655.31260-1-nefelim4ag@gmail.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Calculate the byte core set for a data sample:
 - Sort the bucket counts in decreasing order
 - Count how many byte values account for 90% of the sample
 - If the core set is small (<=25%), the data is easily compressible
 - If the core set is large (>=80%), the data is not compressible

Signed-off-by: Timofey Titovets
---
 fs/btrfs/compression.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/compression.h |  2 ++
 2 files changed, 60 insertions(+)

-- 
2.13.3

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 1429b11f2c5f..a469a7c21f5a 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -33,6 +33,7 @@
 #include
 #include
 #include
+#include
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -1069,6 +1070,42 @@ static inline int byte_set_size(const struct heuristic_bucket_item *bucket)
 	return byte_set_size;
 }
 
+/* For bucket sorting */
+static inline int heuristic_bucket_compare(const void *lv, const void *rv)
+{
+	struct heuristic_bucket_item *l = (struct heuristic_bucket_item *)(lv);
+	struct heuristic_bucket_item *r = (struct heuristic_bucket_item *)(rv);
+
+	return r->count - l->count;
+}
+
+/*
+ * Byte Core set size
+ * How many bytes use 90% of sample
+ */
+static inline int byte_core_set_size(struct heuristic_bucket_item *bucket,
+				     u32 core_set_threshold)
+{
+	int a = 0;
+	u32 coreset_sum = 0;
+
+	for (; a < BTRFS_HEURISTIC_BYTE_CORE_SET_LOW; a++)
+		coreset_sum += bucket[a].count;
+
+	if (coreset_sum > core_set_threshold)
+		return a;
+
+	for (; a < BTRFS_HEURISTIC_BYTE_CORE_SET_HIGH; a++) {
+		if (bucket[a].count == 0)
+			break;
+		coreset_sum += bucket[a].count;
+		if (coreset_sum > core_set_threshold)
+			break;
+	}
+
+	return a;
+}
+
 /*
  * Compression heuristic.
  *
@@ -1092,6 +1129,8 @@ int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end)
 	struct heuristic_bucket_item *bucket;
 	int a, b, ret;
 	u8 symbol, *input_data;
+	u32 core_set_threshold;
+	u32 input_size = end - start;
 
 	ret = 1;
 
@@ -1123,6 +1162,25 @@ int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end)
 		goto out;
 	}
 
+	/* Sort in reverse order */
+	sort(bucket, BTRFS_HEURISTIC_BUCKET_SIZE,
+	     sizeof(struct heuristic_bucket_item), &heuristic_bucket_compare,
+	     NULL);
+
+	core_set_threshold = (input_size*90)/(BTRFS_HEURISTIC_ITER_OFFSET*100);
+	core_set_threshold *= BTRFS_HEURISTIC_READ_SIZE;
+
+	a = byte_core_set_size(bucket, core_set_threshold);
+	if (a <= BTRFS_HEURISTIC_BYTE_CORE_SET_LOW) {
+		ret = 2;
+		goto out;
+	}
+
+	if (a >= BTRFS_HEURISTIC_BYTE_CORE_SET_HIGH) {
+		ret = 0;
+		goto out;
+	}
+
 out:
 	kfree(bucket);
 	return ret;
diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 03857967815a..0fcd1a485adb 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -139,6 +139,8 @@ struct heuristic_bucket_item {
 #define BTRFS_HEURISTIC_ITER_OFFSET 256
 #define BTRFS_HEURISTIC_BUCKET_SIZE 256
 #define BTRFS_HEURISTIC_BYTE_SET_THRESHOLD 64
+#define BTRFS_HEURISTIC_BYTE_CORE_SET_LOW BTRFS_HEURISTIC_BYTE_SET_THRESHOLD
+#define BTRFS_HEURISTIC_BYTE_CORE_SET_HIGH 200 // 80%
 
 int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end);
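
For anyone who wants to experiment with the core set idea outside the kernel, here is a minimal userspace sketch. It is a simplified illustration, not the code from this patch: it counts every byte of the buffer instead of sampling, uses qsort() instead of the kernel's sort(), and stops as soon as 90% of the bytes are covered. The names cmp_desc, core_set_threshold, classify and the read_size parameter are hypothetical. The value of BTRFS_HEURISTIC_READ_SIZE is not shown in this patch, so the threshold helper takes it as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BUCKET_SIZE   256  /* one bucket per possible byte value */
#define ITER_OFFSET   256  /* mirrors BTRFS_HEURISTIC_ITER_OFFSET */
#define CORE_SET_LOW   64  /* 64/256 = 25% */
#define CORE_SET_HIGH 200  /* 200/256 is about 78% */

/* Sort counts in decreasing order; comparing avoids subtraction overflow. */
static int cmp_desc(const void *lv, const void *rv)
{
	uint32_t l = *(const uint32_t *)lv;
	uint32_t r = *(const uint32_t *)rv;

	return (l < r) - (l > r);
}

/*
 * Same integer arithmetic as the patch: 90% of the sampled bytes, with
 * the division truncated before multiplying by the read size.
 * read_size is an assumed parameter, not a value taken from the patch.
 */
static uint32_t core_set_threshold(uint32_t input_size, uint32_t read_size)
{
	uint32_t t = (input_size * 90) / (ITER_OFFSET * 100);

	return t * read_size;
}

/* Number of distinct byte values needed to cover 90% of buf. */
static int byte_core_set_size(const uint8_t *buf, size_t len)
{
	uint32_t count[BUCKET_SIZE] = {0};
	uint64_t threshold, sum = 0;
	int n = 0;

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++)
		count[buf[i]]++;

	qsort(count, BUCKET_SIZE, sizeof(count[0]), cmp_desc);

	threshold = (uint64_t)len * 90 / 100;
	while (n < BUCKET_SIZE && count[n] != 0 && sum < threshold)
		sum += count[n++];

	return n;
}

/* 2: likely compressible, 0: likely incompressible, 1: undecided. */
static int classify(const uint8_t *buf, size_t len)
{
	int n = byte_core_set_size(buf, len);

	if (n <= CORE_SET_LOW)
		return 2;
	if (n >= CORE_SET_HIGH)
		return 0;
	return 1;
}
```

A buffer holding a single repeated byte has a core set of 1 and is classified as compressible, while a buffer in which all 256 byte values are equally frequent needs 231 values to reach 90% and is classified as incompressible. Note that with integer division the threshold is slightly below a true 90% of the sample, e.g. a 128KiB range with an assumed read size of 16 gives 7360 rather than 7372.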