[v4,2/2] ndctl: add list --media-errors support

Message ID 149446059426.19998.8536490603388756669.stgit@djiang5-desk3.ch.intel.com (mailing list archive)
State New, archived

Commit Message

Dave Jiang May 10, 2017, 11:57 p.m. UTC
ACPI NFIT enabled platforms provide media errors as absolute physical
address offsets. Add an option to ndctl to display those media errors
translated to region and namespace device level offsets in an "ndctl
list" listing. BTT badblocks show is not supported in this iteration and
will come later.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---

v2: added fix to badblocks display from Toshi's testing result.
v3: fixed naming issues from Dan's comments.
    fixed badblocks boundary offset calculations from Toshi's testing.
v4: Add indicator to show badblocks exist or not, per Toshi's comments.

 ndctl/list.c      |   25 ++++++-
 ndctl/namespace.c |    2 -
 util/json.c       |  195 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 util/json.h       |    8 ++
 4 files changed, 222 insertions(+), 8 deletions(-)
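
To make the translation described in the commit message concrete, here is a
minimal sketch of the arithmetic for a single bad-block entry. It assumes, as
the "<< 9" shifts in the patch suggest, that region badblocks are reported in
512-byte sectors relative to the region start. The names translate_badblock()
and struct dev_bb are hypothetical and not part of ndctl, and the sketch uses
an exclusive end address rather than the inclusive one used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: a bad range in device-relative 512-byte sectors. */
struct dev_bb {
	unsigned long long offset;
	unsigned int len;
};

/*
 * Clip one region bad-block entry to the device window
 * [dev_begin, dev_begin + dev_size) and rebase it to the device start.
 * region_begin and dev_begin are absolute physical byte addresses;
 * bb_offset and bb_len are 512-byte sectors relative to the region.
 * Returns false when the entry does not touch the device at all.
 */
static bool translate_badblock(unsigned long long region_begin,
		unsigned long long bb_offset, unsigned int bb_len,
		unsigned long long dev_begin, unsigned long long dev_size,
		struct dev_bb *out)
{
	unsigned long long bb_begin = region_begin + (bb_offset << 9);
	unsigned long long bb_end = bb_begin + ((unsigned long long)bb_len << 9);
	unsigned long long dev_end = dev_begin + dev_size; /* exclusive */
	unsigned long long begin, end;

	assert(out != NULL);

	/* no overlap between the bad range and the device window */
	if (bb_end <= dev_begin || bb_begin >= dev_end)
		return false;

	/* clamp the bad range to the device boundaries */
	begin = bb_begin < dev_begin ? dev_begin : bb_begin;
	end = bb_end > dev_end ? dev_end : bb_end;

	/* convert back to sectors, now relative to the device start */
	out->offset = (begin - dev_begin) >> 9;
	out->len = (unsigned int)((end - begin) >> 9);
	return true;
}
```

For example, with a device that starts 2 MiB (4096 sectors) into its region
and is 8 sectors long, a region entry at sector 4090 with length 8 is clipped
to device offset 0, length 2.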

Comments

Kani, Toshi May 11, 2017, 3:31 p.m. UTC | #1
On Wed, 2017-05-10 at 16:57 -0700, Dave Jiang wrote:
> ACPI NFIT enabled platforms provide media errors as absolute physical
> address offsets. Add an option to ndctl to display those media errors
> translated to region and namespace device level offsets in an "ndctl
> list" listing. BTT badblocks show is not supported in this iteration
> and will come later.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> 
> v2: added fix to badblocks display from Toshi's testing result.
> v3: fixed naming issues from Dan's comments.
>     fixed badblocks boundary offset calculations from Toshi's
> testing.
> v4: Add indicator to show badblocks exist or not, per Toshi's
> comments.

This version works pretty good, except that it does not show
"has_badblocks" for sector mode.

# ndctl list -M -m sector
{
  "dev":"namespace0.0",
  "mode":"sector",
  "size":17162027008,
  "uuid":"7742dff3-3edd-4a28-894d-9ac1aede6b2c",
  "sector_size":4096,
  "blockdev":"pmem0s"
}

Thanks,
-Toshi
Dan Williams May 11, 2017, 3:39 p.m. UTC | #2
On Thu, May 11, 2017 at 8:31 AM, Kani, Toshimitsu <toshi.kani@hpe.com> wrote:
> On Wed, 2017-05-10 at 16:57 -0700, Dave Jiang wrote:
>> ACPI NFIT enabled platforms provide media errors as absolute physical
>> address offsets. Add an option to ndctl to display those media errors
>> translated to region and namespace device level offsets in an "ndctl
>> list" listing. BTT badblocks show is not supported in this iteration
>> and will come later.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>
>> v2: added fix to badblocks display from Toshi's testing result.
>> v3: fixed naming issues from Dan's comments.
>>     fixed badblocks boundary offset calculations from Toshi's
>> testing.
>> v4: Add indicator to show badblocks exist or not, per Toshi's
>> comments.
>
> This version works pretty good, except that it does not show
> "has_badblocks" for sector mode.
>
> # ndctl list -M -m sector
> {
>   "dev":"namespace0.0",
>   "mode":"sector",
>   "size":17162027008,
>   "uuid":"7742dff3-3edd-4a28-894d-9ac1aede6b2c",
>   "sector_size":4096,
>   "blockdev":"pmem0s"
> }

Since we're going to spin the patch to fix this up... Toshi, what do
you think about changing "has_badblocks" to "badblock_count"? My
concern is that for the btt case it will also count badblocks that are
on the free list or in metadata, but it at least gives you a better
indication of the damage level of the underlying namespace.
Kani, Toshi May 11, 2017, 3:55 p.m. UTC | #3
On Thu, 2017-05-11 at 08:39 -0700, Dan Williams wrote:
> On Thu, May 11, 2017 at 8:31 AM, Kani, Toshimitsu <toshi.kani@hpe.com> wrote:
> > On Wed, 2017-05-10 at 16:57 -0700, Dave Jiang wrote:
> > > ACPI NFIT enabled platforms provide media errors as absolute
> > > physical address offsets. Add an option to ndctl to display those
> > > media errors translated to region and namespace device level
> > > offsets in an "ndctl list" listing. BTT badblocks show is not
> > > supported in this iteration and will come later.
> > > 
> > > Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> > > ---
> > > 
> > > v2: added fix to badblocks display from Toshi's testing result.
> > > v3: fixed naming issues from Dan's comments.
> > >     fixed badblocks boundary offset calculations from Toshi's
> > > testing.
> > > v4: Add indicator to show badblocks exist or not, per Toshi's
> > > comments.
> > 
> > This version works pretty good, except that it does not show
> > "has_badblocks" for sector mode.
> > 
> > # ndctl list -M -m sector
> > {
> >   "dev":"namespace0.0",
> >   "mode":"sector",
> >   "size":17162027008,
> >   "uuid":"7742dff3-3edd-4a28-894d-9ac1aede6b2c",
> >   "sector_size":4096,
> >   "blockdev":"pmem0s"
> > }
> 
> Since we're going to spin the patch to fix this up... Toshi, what do
> you think about changing "has_badblocks" to "badblock_count"? My
> concern is that for the btt case it will also count badblocks that
> are on the free list or in metadata, but it at least gives you a
> better indication of the damage level of the underlying namespace.

Good idea.  Yes, I like "badblock_count" better.

Thanks!
-Toshi
Dave Jiang May 11, 2017, 4:09 p.m. UTC | #4
On 05/11/2017 08:31 AM, Kani, Toshimitsu wrote:
> On Wed, 2017-05-10 at 16:57 -0700, Dave Jiang wrote:
>> ACPI NFIT enabled platforms provide media errors as absolute physical
>> address offsets. Add an option to ndctl to display those media errors
>> translated to region and namespace device level offsets in an "ndctl
>> list" listing. BTT badblocks show is not supported in this iteration
>> and will come later.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>
>> v2: added fix to badblocks display from Toshi's testing result.
>> v3: fixed naming issues from Dan's comments.
>>     fixed badblocks boundary offset calculations from Toshi's
>> testing.
>> v4: Add indicator to show badblocks exist or not, per Toshi's
>> comments.
> 
> This version works pretty good, except that it does not show
> "has_badblocks" for sector mode.

Yes, I intentionally left it out. I was going to add sector mode support
as a follow-on after Dan accepts these patches, since it requires
additional support code to read the btt badblocks file, etc. If you'd
rather have everything in right now, I can try to get sector mode support
in now.

> 
> # ndctl list -M -m sector
> {
>   "dev":"namespace0.0",
>   "mode":"sector",
>   "size":17162027008,
>   "uuid":"7742dff3-3edd-4a28-894d-9ac1aede6b2c",
>   "sector_size":4096,
>   "blockdev":"pmem0s"
> }
> 
> Thanks,
> -Toshi
>
Dan Williams May 11, 2017, 4:12 p.m. UTC | #5
On Thu, May 11, 2017 at 9:09 AM, Dave Jiang <dave.jiang@intel.com> wrote:
>
>
> On 05/11/2017 08:31 AM, Kani, Toshimitsu wrote:
>> On Wed, 2017-05-10 at 16:57 -0700, Dave Jiang wrote:
>>> ACPI NFIT enabled platforms provide media errors as absolute physical
>>> address offsets. Add an option to ndctl to display those media errors
>>> translated to region and namespace device level offsets in an "ndctl
>>> list" listing. BTT badblocks show is not supported in this iteration
>>> and will come later.
>>>
>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>> ---
>>>
>>> v2: added fix to badblocks display from Toshi's testing result.
>>> v3: fixed naming issues from Dan's comments.
>>>     fixed badblocks boundary offset calculations from Toshi's
>>> testing.
>>> v4: Add indicator to show badblocks exist or not, per Toshi's
>>> comments.
>>
>> This version works pretty good, except that it does not show
>> "has_badblocks" for sector mode.
>
> Yes, I intentionally left it out. I was going to add sector mode support
> as a follow-on after Dan accepts these patches, since it requires
> additional support code to read the btt badblocks file, etc. If you'd
> rather have everything in right now, I can try to get sector mode support
> in now.

Sorry, crossed wires. You don't need the btt badblocks file, just
count the badblocks from the region that also fall in the namespace
range.
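
The counting suggested above amounts to a range intersection between each
region bad-block entry and the namespace's address window. A minimal sketch,
assuming region badblocks are 512-byte-sector ranges relative to the region
start (as the "<< 9" shifts in the patch imply). struct bb_range and
count_bad_sectors() are hypothetical names, and counting sectors rather than
entries is also an assumption; the thread does not settle which of the two
"badblock_count" should report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a region bad-block entry (512-byte sectors). */
struct bb_range {
	unsigned long long offset; /* sectors from the region start */
	unsigned int len;          /* length in sectors */
};

/*
 * Sum the bad sectors of a region that also fall inside the namespace
 * window [ns_begin, ns_begin + ns_size). region_begin and ns_begin are
 * absolute physical byte addresses; ns_size is in bytes.
 */
static unsigned long long count_bad_sectors(const struct bb_range *bbs,
		size_t n, unsigned long long region_begin,
		unsigned long long ns_begin, unsigned long long ns_size)
{
	unsigned long long ns_end = ns_begin + ns_size; /* exclusive */
	unsigned long long count = 0;
	size_t i;

	assert(bbs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		unsigned long long begin = region_begin + (bbs[i].offset << 9);
		unsigned long long end = begin +
			((unsigned long long)bbs[i].len << 9);

		/* skip entries that lie entirely outside the namespace */
		if (end <= ns_begin || begin >= ns_end)
			continue;

		/* count only the part that overlaps the namespace */
		if (begin < ns_begin)
			begin = ns_begin;
		if (end > ns_end)
			end = ns_end;
		count += (end - begin) >> 9;
	}

	return count;
}
```

Because this only walks the region list and compares addresses, it needs no
BTT-specific state, which is why it would also work for sector mode.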

Patch

diff --git a/ndctl/list.c b/ndctl/list.c
index 536d333..1640ae8 100644
--- a/ndctl/list.c
+++ b/ndctl/list.c
@@ -24,6 +24,7 @@  static struct {
 	bool idle;
 	bool health;
 	bool dax;
+	bool media_errors;
 } list;
 
 static struct {
@@ -99,7 +100,8 @@  static struct json_object *list_namespaces(struct ndctl_region *region,
 						jnamespaces);
 		}
 
-		jndns = util_namespace_to_json(ndns, list.idle, list.dax);
+		jndns = util_namespace_to_json(ndns, list.idle, list.dax,
+				list.media_errors);
 		if (!jndns) {
 			fail("\n");
 			continue;
@@ -117,12 +119,14 @@  static struct json_object *list_namespaces(struct ndctl_region *region,
 	return NULL;
 }
 
-static struct json_object *region_to_json(struct ndctl_region *region)
+static struct json_object *region_to_json(struct ndctl_region *region,
+		bool include_media_err)
 {
 	struct json_object *jregion = json_object_new_object();
-	struct json_object *jobj, *jmappings = NULL;
+	struct json_object *jobj, *jobj2, *jmappings = NULL;
 	struct ndctl_interleave_set *iset;
 	struct ndctl_mapping *mapping;
+	unsigned int bb_count = 0;
 
 	if (!jregion)
 		return NULL;
@@ -203,6 +207,17 @@  static struct json_object *region_to_json(struct ndctl_region *region)
 		json_object_object_add(jregion, "state", jobj);
 	}
 
+	jobj2 = util_region_badblocks_to_json(region, include_media_err,
+			&bb_count);
+	jobj = json_object_new_boolean((bool)!!bb_count);
+	if (!jobj) {
+		json_object_put(jobj2);
+		goto err;
+	}
+	json_object_object_add(jregion, "has_badblocks", jobj);
+	if (include_media_err && jobj2)
+		json_object_object_add(jregion, "badblocks", jobj2);
+
 	list_namespaces(region, jregion, NULL, false);
 	return jregion;
  err:
@@ -240,6 +255,8 @@  int cmd_list(int argc, const char **argv, void *ctx)
 		OPT_BOOLEAN('X', "device-dax", &list.dax,
 				"include device-dax info"),
 		OPT_BOOLEAN('i', "idle", &list.idle, "include idle devices"),
+		OPT_BOOLEAN('M', "media-errors", &list.media_errors,
+				"include media errors"),
 		OPT_END(),
 	};
 	const char * const u[] = {
@@ -404,7 +421,7 @@  int cmd_list(int argc, const char **argv, void *ctx)
 							jregions);
 			}
 
-			jregion = region_to_json(region);
+			jregion = region_to_json(region, list.media_errors);
 			if (!jregion) {
 				fail("\n");
 				continue;
diff --git a/ndctl/namespace.c b/ndctl/namespace.c
index 89b9b6a..6e150b1 100644
--- a/ndctl/namespace.c
+++ b/ndctl/namespace.c
@@ -392,7 +392,7 @@  static int setup_namespace(struct ndctl_region *region,
 		error("%s: failed to enable\n",
 				ndctl_namespace_get_devname(ndns));
 	} else {
-		struct json_object *jndns = util_namespace_to_json(ndns, 0, 1);
+		struct json_object *jndns = util_namespace_to_json(ndns, 0, 1, 0);
 
 		if (jndns)
 			printf("%s\n", json_object_to_json_string_ext(jndns,
diff --git a/util/json.c b/util/json.c
index 07fd113..9f1bff3 100644
--- a/util/json.c
+++ b/util/json.c
@@ -233,19 +233,180 @@  struct json_object *util_daxctl_region_to_json(struct daxctl_region *region,
 	return NULL;
 }
 
+struct json_object *util_region_badblocks_to_json(struct ndctl_region *region,
+		bool include_media_errors, unsigned int *bb_count)
+{
+	struct json_object *jbb = NULL, *jbbs = NULL, *jobj;
+	struct badblock *bb;
+	int bbs = 0;
+
+	if (include_media_errors) {
+		jbbs = json_object_new_array();
+		if (!jbbs)
+			return NULL;
+	}
+
+	ndctl_region_badblock_foreach(region, bb) {
+		if (include_media_errors) {
+			jbb = json_object_new_object();
+			if (!jbb)
+				goto err_array;
+
+			jobj = json_object_new_int64(bb->offset);
+			if (!jobj)
+				goto err;
+			json_object_object_add(jbb, "offset", jobj);
+
+			jobj = json_object_new_int(bb->len);
+			if (!jobj)
+				goto err;
+			json_object_object_add(jbb, "length", jobj);
+
+			json_object_array_add(jbbs, jbb);
+		}
+
+		bbs++;
+	}
+
+	*bb_count = bbs;
+
+	if (bbs)
+		return jbbs;
+
+ err:
+	json_object_put(jbb);
+ err_array:
+	json_object_put(jbbs);
+	return NULL;
+}
+
+static struct json_object *dev_badblocks_to_json(struct ndctl_region *region,
+		unsigned long long dev_begin, unsigned long long dev_size,
+		bool include_media_errors, unsigned int *bb_count)
+{
+	struct json_object *jbb = NULL, *jbbs = NULL, *jobj;
+	unsigned long long region_begin, dev_end, offset;
+	unsigned int len, bbs = 0;
+	struct badblock *bb;
+
+	region_begin = ndctl_region_get_resource(region);
+	if (region_begin == ULLONG_MAX)
+		return NULL;
+
+	dev_end = dev_begin + dev_size - 1;
+
+	if (include_media_errors) {
+		jbbs = json_object_new_array();
+		if (!jbbs)
+			return NULL;
+	}
+
+	ndctl_region_badblock_foreach(region, bb) {
+		unsigned long long bb_begin, bb_end, begin, end;
+
+		bb_begin = region_begin + (bb->offset << 9);
+		bb_end = bb_begin + (bb->len << 9) - 1;
+
+		if (bb_end <= dev_begin || bb_begin >= dev_end)
+			continue;
+
+		if (bb_begin < dev_begin)
+			begin = dev_begin;
+		else
+			begin = bb_begin;
+
+		if (bb_end > dev_end)
+			end = dev_end;
+		else
+			end = bb_end;
+
+		offset = (begin - dev_begin) >> 9;
+		len = (end - begin + 1) >> 9;
+
+		if (include_media_errors) {
+			/* add to json */
+			jbb = json_object_new_object();
+			if (!jbb)
+				goto err_array;
+
+			jobj = json_object_new_int64(offset);
+			if (!jobj)
+				goto err;
+			json_object_object_add(jbb, "offset", jobj);
+
+			jobj = json_object_new_int(len);
+			if (!jobj)
+				goto err;
+			json_object_object_add(jbb, "length", jobj);
+
+			json_object_array_add(jbbs, jbb);
+		}
+		bbs++;
+	}
+
+	*bb_count = bbs;
+
+	if (bbs)
+		return jbbs;
+
+ err:
+	json_object_put(jbb);
+ err_array:
+	json_object_put(jbbs);
+	return NULL;
+}
+
+struct json_object *util_pfn_badblocks_to_json(struct ndctl_pfn *pfn,
+		bool include_media_errors, unsigned int *bb_count)
+{
+	struct ndctl_region *region = ndctl_pfn_get_region(pfn);
+	unsigned long long pfn_begin, pfn_size;
+
+	pfn_begin = ndctl_pfn_get_resource(pfn);
+	if (pfn_begin == ULLONG_MAX)
+		return NULL;
+
+	pfn_size = ndctl_pfn_get_size(pfn);
+	if (pfn_size == ULLONG_MAX)
+		return NULL;
+
+	return dev_badblocks_to_json(region, pfn_begin, pfn_size,
+			include_media_errors, bb_count);
+}
+
+struct json_object *util_dax_badblocks_to_json(struct ndctl_dax *dax,
+		bool include_media_errors, unsigned int *bb_count)
+{
+	struct ndctl_region *region = ndctl_dax_get_region(dax);
+	unsigned long long dax_begin, dax_size;
+
+	dax_begin = ndctl_dax_get_resource(dax);
+	if (dax_begin == ULLONG_MAX)
+		return NULL;
+
+	dax_size = ndctl_dax_get_size(dax);
+	if (dax_size == ULLONG_MAX)
+		return NULL;
+
+	return dev_badblocks_to_json(region, dax_begin, dax_size,
+			include_media_errors, bb_count);
+}
+
 struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
-		bool include_idle, bool include_dax)
+		bool include_idle, bool include_dax,
+		bool include_media_errors)
 {
 	struct json_object *jndns = json_object_new_object();
 	unsigned long long size = ULLONG_MAX;
 	enum ndctl_namespace_mode mode;
-	struct json_object *jobj;
+	struct json_object *jobj, *jobj2;
 	const char *bdev = NULL;
 	struct ndctl_btt *btt;
 	struct ndctl_pfn *pfn;
 	struct ndctl_dax *dax;
 	char buf[40];
 	uuid_t uuid;
+	unsigned int bb_count = 0;
 
 	if (!jndns)
 		return NULL;
@@ -368,6 +529,36 @@  struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
 		json_object_object_add(jndns, "state", jobj);
 	}
 
+	if (pfn)
+		jobj2 = util_pfn_badblocks_to_json(pfn, include_media_errors,
+				&bb_count);
+	else if (dax)
+		jobj2 = util_dax_badblocks_to_json(dax, include_media_errors,
+				&bb_count);
+	else if (btt) /* do nothing for now */
+		jobj2 = NULL;
+	else {
+		struct ndctl_region *region =
+			ndctl_namespace_get_region(ndns);
+
+		jobj2 = util_region_badblocks_to_json(region,
+				include_media_errors, &bb_count);
+	}
+
+	jobj = json_object_new_boolean((bool)!!bb_count);
+	if (!jobj) {
+		json_object_put(jobj2);
+		goto err;
+	}
+
+	/* no BTT support yet */
+	if (!btt) {
+		json_object_object_add(jndns, "has_badblocks", jobj);
+		if (include_media_errors && jobj2)
+				json_object_object_add(jndns,
+						"badblocks", jobj2);
+	}
+
 	return jndns;
  err:
 	json_object_put(jndns);
diff --git a/util/json.h b/util/json.h
index 2449c2d..0f47267 100644
--- a/util/json.h
+++ b/util/json.h
@@ -10,9 +10,15 @@  struct json_object *util_bus_to_json(struct ndctl_bus *bus);
 struct json_object *util_dimm_to_json(struct ndctl_dimm *dimm);
 struct json_object *util_mapping_to_json(struct ndctl_mapping *mapping);
 struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
-		bool include_idle, bool include_dax);
+		bool include_idle, bool include_dax, bool include_media_errs);
 struct daxctl_region;
 struct daxctl_dev;
+struct json_object *util_dax_badblocks_to_json(struct ndctl_dax *dax,
+		bool include_media_errors, unsigned int *bb_count);
+struct json_object *util_pfn_badblocks_to_json(struct ndctl_pfn *pfn,
+		bool include_media_errors, unsigned int *bb_count);
+struct json_object *util_region_badblocks_to_json(struct ndctl_region *region,
+		bool include_media_errors, unsigned int *bb_count);
 struct json_object *util_daxctl_region_to_json(struct daxctl_region *region,
 		bool include_devs, const char *ident, bool include_idle);
 struct json_object *util_daxctl_dev_to_json(struct daxctl_dev *dev);