From patchwork Mon Jun 8 22:03:05 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kiyoshi Ueda
X-Patchwork-Id: 28763
X-Patchwork-Delegate: agk@redhat.com
Date: Mon, 8 Jun 2009 23:03:05 +0100
From: Kiyoshi Ueda via agk
To: dm-devel@redhat.com
Message-ID: <20090608220305.GB647@agk-dp.fab.redhat.com>
Mail-Followup-To: dm-devel@redhat.com
Content-Disposition: inline
User-Agent: Mutt/1.5.18 (2008-05-17)
X-Scanned-By: MIMEDefang 2.58 on 172.16.27.26
X-loop: dm-devel@redhat.com
Subject: [dm-devel] rqdm-dlb-04-service-time-dlb-add-perf-limit.patch
X-BeenThere: dm-devel@redhat.com
X-Mailman-Version: 2.1.5
Precedence: junk
Reply-To: device-mapper development
List-Id: device-mapper development
Sender: dm-devel-bounces@redhat.com
Errors-To: dm-devel-bounces@redhat.com

o Limited the second table argument (relative throughput value) to the
  range 0-100.  As a result, there is no need to use 'size_t' for ->perf.
  Use 'unsigned'.  Updated comments/documents.

o Converted the service time calculation method from division to
  multiplication.

Signed-off-by: Kiyoshi Ueda
Signed-off-by: Jun'ichi Nomura
---
 Documentation/device-mapper/dm-service-time.txt |    1
 drivers/md/dm-service-time.c                    |   57 +++++++++++++++---------
 2 files changed, 37 insertions(+), 21 deletions(-)

Index: 2.6.30-rc5/drivers/md/dm-service-time.c
===================================================================
--- 2.6.30-rc5.orig/drivers/md/dm-service-time.c
+++ 2.6.30-rc5/drivers/md/dm-service-time.c
@@ -13,7 +13,10 @@
 
 #define DM_MSG_PREFIX	"multipath service-time"
 #define ST_MIN_IO	1
-#define ST_VERSION	"0.1.0"
+#define ST_MAX_PERF	100
+#define ST_MAX_PERF_SHIFT	7
+#define ST_MAX_INFLIGHT_SIZE	((size_t)-1 >> ST_MAX_PERF_SHIFT)
+#define ST_VERSION	"0.2.0"
 
 struct selector {
 	struct list_head valid_paths;
@@ -24,7 +27,7 @@ struct path_info {
 	struct list_head list;
 	struct dm_path *path;
 	unsigned repeat_count;
-	size_t perf;
+	unsigned perf;
 	atomic_t in_flight_size;	/* Total size of in-flight I/Os */
 };
 
@@ -84,12 +87,11 @@ static int st_status(struct path_selecto
 
 		switch (type) {
 		case STATUSTYPE_INFO:
-			DMEMIT("%d %llu ", atomic_read(&pi->in_flight_size),
-			       (unsigned long long)pi->perf);
+			DMEMIT("%d %u ", atomic_read(&pi->in_flight_size),
+			       pi->perf);
 			break;
 		case STATUSTYPE_TABLE:
-			DMEMIT("%u %llu ", pi->repeat_count,
-			       (unsigned long long)pi->perf);
+			DMEMIT("%u %u ", pi->repeat_count, pi->perf);
 			break;
 		}
 	}
@@ -103,7 +105,7 @@ static int st_add_path(struct path_selec
 	struct selector *s = ps->context;
 	struct path_info *pi;
 	unsigned repeat_count = ST_MIN_IO;
-	unsigned long long tmpll = 1;
+	unsigned perf = 1;
 
 	/*
 	 * Arguments: [ []]
@@ -111,6 +113,7 @@ static int st_add_path(struct path_selec
 	 *		If not given, default (ST_MIN_IO) is used.
 	 * : The relative throughput value of the path
 	 *		among all paths in the path-group.
+	 *		The valid range: 0-
 	 *		If not given, minimum value '1' is used.
 	 *		If '0' is given, the path isn't selected while
 	 *		other paths having a positive value are
@@ -126,7 +129,8 @@ static int st_add_path(struct path_selec
 		return -EINVAL;
 	}
 
-	if ((argc == 2) && (sscanf(argv[1], "%llu", &tmpll) != 1)) {
+	if ((argc == 2) &&
+	    (sscanf(argv[1], "%u", &perf) != 1 || perf > ST_MAX_PERF)) {
 		*error = "service-time ps: invalid performance value";
 		return -EINVAL;
 	}
@@ -140,7 +144,7 @@ static int st_add_path(struct path_selec
 
 	pi->path = path;
 	pi->repeat_count = repeat_count;
-	pi->perf = tmpll;
+	pi->perf = perf;
 	atomic_set(&pi->in_flight_size, 0);
 
 	path->pscontext = pi;
@@ -186,7 +190,7 @@ static int st_reinstate_path(struct path
 static int st_compare_load(struct path_info *pi1, struct path_info *pi2,
 			   size_t incoming)
 {
-	size_t sz1, sz2;
+	size_t sz1, sz2, st1, st2;
 
 	sz1 = atomic_read(&pi1->in_flight_size);
 	sz2 = atomic_read(&pi2->in_flight_size);
@@ -206,21 +210,32 @@ static int st_compare_load(struct path_i
 
 	/*
 	 * Case 3: Calculate service time. Choose faster path.
-	 *         if ((sz1+incoming)/pi1->perf < (sz2+incoming)/pi2->perf) pi1
-	 *         if ((sz1+incoming)/pi1->perf > (sz2+incoming)/pi2->perf) pi2
+	 *         Service time using pi1: st1 = (sz1 + incoming) / pi1->perf
+	 *         Service time using pi2: st2 = (sz2 + incoming) / pi2->perf
+	 *
+	 *         To avoid the division, transform the expression to use
+	 *         multiplication.
+	 *         Because ->perf > 0 here, if st1 < st2, the expressions
+	 *         below are the same meaning:
+	 *         (sz1 + incoming) / pi1->perf < (sz2 + incoming) / pi2->perf
+	 *         (sz1 + incoming) * pi2->perf < (sz2 + incoming) * pi1->perf
+	 *         So use the latter one.
 	 */
 	sz1 += incoming;
 	sz2 += incoming;
-	while (sz1 && sz2 && (sz1 < pi1->perf) && (sz2 < pi2->perf)) {
-		/* Size is not big enough to compare by division. Shift up */
-		sz1 <<= 2;
-		sz2 <<= 2;
+	if (unlikely(sz1 >= ST_MAX_INFLIGHT_SIZE ||
+		     sz2 >= ST_MAX_INFLIGHT_SIZE)) {
+		/*
+		 * Size may be too big for multiplying pi->perf and overflow.
+		 * To avoid the overflow and mis-selection, shift down both.
+		 */
+		sz1 >>= ST_MAX_PERF_SHIFT;
+		sz2 >>= ST_MAX_PERF_SHIFT;
 	}
-	do_div(sz1, pi1->perf);
-	do_div(sz2, pi2->perf);
-
-	if (sz1 != sz2)
-		return sz1 - sz2;
+	st1 = sz1 * pi2->perf;
+	st2 = sz2 * pi1->perf;
+	if (st1 != st2)
+		return st1 - st2;
 
 	/*
 	 * Case 4: Service time is equal. Choose higher performance path.
Index: 2.6.30-rc5/Documentation/device-mapper/dm-service-time.txt
===================================================================
--- 2.6.30-rc5.orig/Documentation/device-mapper/dm-service-time.txt
+++ 2.6.30-rc5/Documentation/device-mapper/dm-service-time.txt
@@ -19,6 +19,7 @@ Table parameters for each path: [: The relative throughput value of the path
 		among all paths in the path-group.
+		The valid range is 0-100.
 		If not given, minimum value '1' is used.
 		If '0' is given, the path isn't selected while
 		other paths having a positive value are available.
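
For readers who want to try the Case 3 comparison outside the kernel, here is a minimal user-space sketch of the cross-multiplication and the overflow guard. It is an illustration only, not code from the patch: compare_service_time() is a hypothetical helper, the sizes passed in are assumed to already include the incoming I/O, both perf values are assumed to be in 1-100 (Cases 1 and 2 have already dealt with perf == 0), and the result is an explicit three-way value where the patch returns st1 - st2.

```c
#include <assert.h>
#include <stddef.h>

/* Constants copied from the patch. */
#define ST_MAX_PERF          100
#define ST_MAX_PERF_SHIFT    7
#define ST_MAX_INFLIGHT_SIZE ((size_t)-1 >> ST_MAX_PERF_SHIFT)

/*
 * Hypothetical helper (not in the patch).
 * Returns <0 if path 1 has the shorter estimated service time,
 * >0 if path 2 does, and 0 if the two estimates are equal.
 * sz1/sz2: in-flight size plus incoming size for each path.
 * perf1/perf2: relative throughput, assumed to be 1..ST_MAX_PERF.
 */
static int compare_service_time(size_t sz1, unsigned perf1,
				size_t sz2, unsigned perf2)
{
	size_t st1, st2;

	assert(perf1 > 0 && perf1 <= ST_MAX_PERF);
	assert(perf2 > 0 && perf2 <= ST_MAX_PERF);

	/*
	 * perf <= 100 < 2^7, so a size below ST_MAX_INFLIGHT_SIZE cannot
	 * overflow when multiplied by perf. Larger sizes are scaled down
	 * together, trading precision for a correct ordering.
	 */
	if (sz1 >= ST_MAX_INFLIGHT_SIZE || sz2 >= ST_MAX_INFLIGHT_SIZE) {
		sz1 >>= ST_MAX_PERF_SHIFT;
		sz2 >>= ST_MAX_PERF_SHIFT;
	}

	/* sz1 / perf1 < sz2 / perf2  <=>  sz1 * perf2 < sz2 * perf1 */
	st1 = sz1 * perf2;
	st2 = sz2 * perf1;

	/* Three-way compare without narrowing a size_t difference to int. */
	return (st1 > st2) - (st1 < st2);
}
```

With equal queue sizes the path with the larger perf value wins, for example compare_service_time(1000, 10, 1000, 20) is positive, so path 2 would be chosen. Equal ratios such as (100, 1) against (200, 2) return 0 and fall through to Case 4.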