diff mbox

acpi/nfit: export read_only attribute of dimms

Message ID 20171030204127.4779-1-Lijun.Pan@dell.com (mailing list archive)
State New, archived
Headers show

Commit Message

Lijun.Pan@dell.com Oct. 30, 2017, 8:41 p.m. UTC
Though flags attribute provides enough information about
the dimm, it is nice to export the read_only attribute if
bit3 of NVDIMM state flag is set.
If error is injected by BIOS, bit3 and bit1 are both set.
If DIMM is set to read-only by BIOS, bit3 is set.
Hence bit3 is good enough to tell whether the dimm is in
read-only mode or not.

Signed-off-by: Lijun Pan <Lijun.Pan@dell.com>
---
 drivers/acpi/nfit/core.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Dan Williams Nov. 6, 2017, 11:32 p.m. UTC | #1
On Mon, Oct 30, 2017 at 1:41 PM, Lijun Pan <Lijun.Pan@dell.com> wrote:
> Though flags attribute provides enough information about
> the dimm, it is nice to export the read_only attribute if
> bit3 of NVDIMM state flag is set.
> If error is injected by BIOS, bit3 and bit1 are both set.
> If DIMM is set to read-only by BIOS, bit3 is set.
> Hence bit3 is good enough to tell whether the dimm is in
> read-only mode or not.

Hmm, can you say about more about why we need this additional attribute?

Applications don't read and write the DIMM directly, they only
interface with the DIMM through namespaces of a given region. The
region device already has a 'read_only' attribute. See
read_only_show() in drivers/nvdimm/region_devs.c.
Elliott, Robert (Servers) Nov. 7, 2017, 2:50 a.m. UTC | #2
> -----Original Message-----
> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On
> Behalf Of Dan Williams
> Sent: Monday, November 06, 2017 5:32 PM
> To: Lijun Pan <Lijun.Pan@dell.com>
> Cc: linux-nvdimm@lists.01.org
> Subject: Re: [PATCH] acpi/nfit: export read_only attribute of dimms
> 
> On Mon, Oct 30, 2017 at 1:41 PM, Lijun Pan <Lijun.Pan@dell.com>
> wrote:
> > Though flags attribute provides enough information about
> > the dimm, it is nice to export the read_only attribute if
> > bit3 of NVDIMM state flag is set.
> > If error is injected by BIOS, bit3 and bit1 are both set.
> > If DIMM is set to read-only by BIOS, bit3 is set.
> > Hence bit3 is good enough to tell whether the dimm is in
> > read-only mode or not.
> 
> Hmm, can you say about more about why we need this additional
> attribute?
> 
> Applications don't read and write the DIMM directly, they only
> interface with the DIMM through namespaces of a given region. The
> region device already has a 'read_only' attribute. See
> read_only_show() in drivers/nvdimm/region_devs.c.

A few more points:

1. The full NVDIMM State Flags value is already reported for
the nmem device, so reporting bit 3 separately as "read_only"
would be redundant.

2. For NVDIMM-Ns, the system is not preventing writing to memory;
system firmware is just warning that any writes will not be
persistent across power loss.

3. Considering the different layers:
* the pmem and btt drivers blocks writes themselves, so it makes
sense for them to report ro at the pmem device layer.
* the nmem layer doesn't block writes, so shouldn't report anything
about them being restricted.
* the region layer doesn't block writes, so shouldn't report anything
about them being restricted.
* the region layer does need to pass along the NVDIMM state flags
to the pmem and btt drivers; read_only might not be the best name
as a user-visible attribute with that information.

Some driver or software might find temporarily writing to those
locations in-place would be useful (e.g. run fsck before copying
the repaired filesystem content to another storage device), so we
shouldn't cut them off unnecessarily.

4. The ACPI definition is a bit open:
"Bit [3] set to 1 indicates that the NVDIMM containing the NVDIMM
region is not able to accept persistent writes. For an
energy-source backed NVDIMM device, Bit [3] is set if it is not
armed or the previous ERASE operation did not complete."

Perhaps NFIT needs to distinguish between these situations:
* writes result in errors
  - this would require special support from the CPU, memory
    controller, and memory bus
* writes are accepted but ignored
  - e.g., an NVDIMM-P type device made from SCM
  - ignored writes create cache coherency problems
* writes are accepted, but not persistent across power loss
  - e.g., an NVDIMM-N

---
Robert Elliott, HPE Persistent Memory
Dan Williams Nov. 7, 2017, 3:35 p.m. UTC | #3
On Mon, Nov 6, 2017 at 6:50 PM, Elliott, Robert (Persistent Memory)
<elliott@hpe.com> wrote:
>
>
>> -----Original Message-----
>> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On
>> Behalf Of Dan Williams
>> Sent: Monday, November 06, 2017 5:32 PM
>> To: Lijun Pan <Lijun.Pan@dell.com>
>> Cc: linux-nvdimm@lists.01.org
>> Subject: Re: [PATCH] acpi/nfit: export read_only attribute of dimms
>>
>> On Mon, Oct 30, 2017 at 1:41 PM, Lijun Pan <Lijun.Pan@dell.com>
>> wrote:
>> > Though flags attribute provides enough information about
>> > the dimm, it is nice to export the read_only attribute if
>> > bit3 of NVDIMM state flag is set.
>> > If error is injected by BIOS, bit3 and bit1 are both set.
>> > If DIMM is set to read-only by BIOS, bit3 is set.
>> > Hence bit3 is good enough to tell whether the dimm is in
>> > read-only mode or not.
>>
>> Hmm, can you say about more about why we need this additional
>> attribute?
>>
>> Applications don't read and write the DIMM directly, they only
>> interface with the DIMM through namespaces of a given region. The
>> region device already has a 'read_only' attribute. See
>> read_only_show() in drivers/nvdimm/region_devs.c.
>
> A few more points:
>
> 1. The full NVDIMM State Flags value is already reported for
> the nmem device, so reporting bit 3 separately as "read_only"
> would be redundant.
>
> 2. For NVDIMM-Ns, the system is not preventing writing to memory;
> system firmware is just warning that any writes will not be
> persistent across power loss.
>
> 3. Considering the different layers:
> * the pmem and btt drivers blocks writes themselves, so it makes
> sense for them to report ro at the pmem device layer.
> * the nmem layer doesn't block writes, so shouldn't report anything
> about them being restricted.
> * the region layer doesn't block writes, so shouldn't report anything
> about them being restricted.

The rationale for surfacing the read_only attribute at the region
level, even though the region does not directly host I/O, is that
regions host attributes that are common to all namespaces in that
region. I see regions as distinct from the nmem level where there is
not necessarily a direct correlation to namespaces, especially when
nmems are optional in the device hierarchy. Regions are also the
gateway interface for setting labels and aggregating state across
several DIMMs.

> * the region layer does need to pass along the NVDIMM state flags
> to the pmem and btt drivers; read_only might not be the best name
> as a user-visible attribute with that information.

The state flags are ACPI specific, so we need to translate to
something Linux generic when it moves to the libnvdimm sub-system.
Read-only when the energy source fails is a preventative measure, the
other state flags indicate that data has already been lost.

> Some driver or software might find temporarily writing to those
> locations in-place would be useful (e.g. run fsck before copying
> the repaired filesystem content to another storage device), so we
> shouldn't cut them off unnecessarily.

The read_only attribute can be overridden, i.e. it's only a default
value for safety.
Lijun.Pan@dell.com Nov. 8, 2017, 1:18 a.m. UTC | #4
> -----Original Message-----
> From: Dan Williams [mailto:dan.j.williams@intel.com]
> Sent: Monday, November 6, 2017 5:32 PM
> To: Pan, Lijun <Lijun_Pan@Dell.com>
> Cc: linux-nvdimm@lists.01.org
> Subject: Re: [PATCH] acpi/nfit: export read_only attribute of dimms
> 
> On Mon, Oct 30, 2017 at 1:41 PM, Lijun Pan <Lijun.Pan@dell.com> wrote:
> > Though flags attribute provides enough information about
> > the dimm, it is nice to export the read_only attribute if
> > bit3 of NVDIMM state flag is set.
> > If error is injected by BIOS, bit3 and bit1 are both set.
> > If DIMM is set to read-only by BIOS, bit3 is set.
> > Hence bit3 is good enough to tell whether the dimm is in
> > read-only mode or not.
> 
> Hmm, can you say about more about why we need this additional attribute?
> 
> Applications don't read and write the DIMM directly, they only
> interface with the DIMM through namespaces of a given region. The
> region device already has a 'read_only' attribute. See
> read_only_show() in drivers/nvdimm/region_devs.c.

redhat's cockpit can show the server's info like dimm, cpu, network,etc.
If the read_only attribute is exported, it is easy for the program to check.
I agree read_only status of the specific dimm can be retrieved via checking
bit3 of nvdimm state flag, but it is not explicit.
If 2 nvdimms are interleaved together, region's read_only only shows that
the region is read only, it does not show the 2 underlying physical dimms comprising
the region are read only.
Dan Williams Nov. 8, 2017, 2:17 a.m. UTC | #5
On Tue, Nov 7, 2017 at 5:18 PM,  <Lijun.Pan@dell.com> wrote:
>
>
>> -----Original Message-----
>> From: Dan Williams [mailto:dan.j.williams@intel.com]
>> Sent: Monday, November 6, 2017 5:32 PM
>> To: Pan, Lijun <Lijun_Pan@Dell.com>
>> Cc: linux-nvdimm@lists.01.org
>> Subject: Re: [PATCH] acpi/nfit: export read_only attribute of dimms
>>
>> On Mon, Oct 30, 2017 at 1:41 PM, Lijun Pan <Lijun.Pan@dell.com> wrote:
>> > Though flags attribute provides enough information about
>> > the dimm, it is nice to export the read_only attribute if
>> > bit3 of NVDIMM state flag is set.
>> > If error is injected by BIOS, bit3 and bit1 are both set.
>> > If DIMM is set to read-only by BIOS, bit3 is set.
>> > Hence bit3 is good enough to tell whether the dimm is in
>> > read-only mode or not.
>>
>> Hmm, can you say about more about why we need this additional attribute?
>>
>> Applications don't read and write the DIMM directly, they only
>> interface with the DIMM through namespaces of a given region. The
>> region device already has a 'read_only' attribute. See
>> read_only_show() in drivers/nvdimm/region_devs.c.
>
> redhat's cockpit can show the server's info like dimm, cpu, network,etc.
> If the read_only attribute is exported, it is easy for the program to check.
> I agree read_only status of the specific dimm can be retrieved via checking
> bit3 of nvdimm state flag, but it is not explicit.
> If 2 nvdimms are interleaved together, region's read_only only shows that
> the region is read only, it does not show the 2 underlying physical dimms comprising
> the region are read only.

Yes, however the DIMM itself is not "read only", that's a default
policy the kernel decides when it see the "not armed" status flag set.
If cockpit wants to show this state it can get it today from the DIMM
state flags emitted by 'ndctl list', see "flag_failed_arm".

# ndctl list -Dd nmem6
{
  "dev":"nmem6",
  "id":"cdab-0a-07e0-fffffeff",
  "handle":65536,
  "phys_id":0,
  "flag_failed_arm":true,
}
diff mbox

Patch

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index ebe0857ac346..f96e65aa29dd 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1480,6 +1480,16 @@  static ssize_t flags_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(flags);
 
+static ssize_t read_only_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	u16 flags = to_nfit_memdev(dev)->flags;
+
+	return sprintf(buf, "%d\n",
+		flags & ACPI_NFIT_MEM_NOT_ARMED ? 1 : 0);
+}
+static DEVICE_ATTR_RO(read_only);
+
 static ssize_t id_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
@@ -1512,6 +1522,7 @@  static struct attribute *acpi_nfit_dimm_attributes[] = {
 	&dev_attr_format1.attr,
 	&dev_attr_serial.attr,
 	&dev_attr_flags.attr,
+	&dev_attr_read_only.attr,
 	&dev_attr_id.attr,
 	&dev_attr_family.attr,
 	&dev_attr_dsm_mask.attr,