[0/3] acpi, nfit: Add dirty shutdown count to sysfs

Message ID 153802226065.833068.11943510429252969385.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)

Message

Dan Williams Sept. 27, 2018, 4:24 a.m. UTC
The Intel NVDIMM command specification publishes a dirty-shutdown-count
in addition to the dirty-shutdown / flush-failed indication that comes
from the ACPI NFIT. This is expected to be a common property of NVDIMMs
and is a static hardware health detail to be cached / exported via
sysfs.

Add plumbing for retrieving this data at driver load time, publish the
count, and use the dynamically retrieved dirty-shutdown indicator to
augment the existing 'flush_failed' flag.
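
As a rough illustration of the "augment" step, the sketch below ORs a static
indication taken from the ACPI NFIT with a dynamic indication retrieved from
the DIMM at load time. The names and bit positions (EX_NFIT_FLUSH_FAILED,
EX_MEM_DIRTY, ex_flush_failed) are hypothetical and chosen for illustration
only; they are not taken from the patches in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define EX_NFIT_FLUSH_FAILED (1u << 0) /* static indication from the ACPI NFIT */
#define EX_MEM_DIRTY         (1u << 0) /* dynamic indication read from the DIMM */

/*
 * Report a flush failure if either source says so: the static table
 * flag, or the dirty-shutdown indicator retrieved at driver load.
 */
bool ex_flush_failed(uint32_t nfit_flags, uint32_t mem_flags)
{
	return (nfit_flags & EX_NFIT_FLUSH_FAILED) ||
	       (mem_flags & EX_MEM_DIRTY);
}
```

Keeping the two sources in separate flag words means the dynamic result can
be filled in later without rewriting the table-derived state.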

---

Dan Williams (3):
      acpi, nfit: Introduce nfit_mem flags
      acpi, nfit: Collect shutdown status
      tools/testing/nvdimm: Populate dirty shutdown data


 drivers/acpi/nfit/core.c              |  115 ++++++++++++++++++++++++++++-----
 drivers/acpi/nfit/intel.h             |   34 ++++++++++
 drivers/acpi/nfit/nfit.h              |   11 +++
 tools/testing/nvdimm/Kbuild           |    1 
 tools/testing/nvdimm/acpi_nfit_test.c |    8 ++
 tools/testing/nvdimm/test/nfit.c      |    3 +
 tools/testing/nvdimm/test/nfit_test.h |   24 -------
 7 files changed, 152 insertions(+), 44 deletions(-)

Comments

Johannes Thumshirn Sept. 27, 2018, 7:11 a.m. UTC | #1
On Wed, Sep 26, 2018 at 09:24:20PM -0700, Dan Williams wrote:
> The Intel NVDIMM command specification publishes a dirty-shutdown-count
> in addition to the dirty-shutdown / flush-failed indication that comes
> from the ACPI NFIT. This is expected to be a common property of NVDIMMs
> and is a static hardware health detail to be cached / exported via
> sysfs.
> 
> Add plumbing for retrieving this data at driver load time, publish the
> count, and use the dynamically retrieved dirty-shutdown indicator to
> augment the existing 'flush_failed' flag.

Is this the same thing as the LSS Latch stuff that went into ndctl?

Keith Busch Sept. 27, 2018, 3:21 p.m. UTC | #2
On Thu, Sep 27, 2018 at 09:11:35AM +0200, Johannes Thumshirn wrote:
> On Wed, Sep 26, 2018 at 09:24:20PM -0700, Dan Williams wrote:
> > The Intel NVDIMM command specification publishes a dirty-shutdown-count
> > in addition to the dirty-shutdown / flush-failed indication that comes
> > from the ACPI NFIT. This is expected to be a common property of NVDIMMs
> > and is a static hardware health detail to be cached / exported via
> > sysfs.
> > 
> > Add plumbing for retrieving this data at driver load time, publish the
> > count, and use the dynamically retrieved dirty-shutdown indicator to
> > augment the existing 'flush_failed' flag.
> 
> Is this the same thing as the LSS Latch stuff that went into ndctl?

On a related note, the ndctl latch implementation doesn't satisfy all
the needs, so I expect it'll be reverted:

  https://lists.01.org/pipermail/linux-nvdimm/2018-September/017892.html

Dan Williams Sept. 27, 2018, 3:33 p.m. UTC | #3
On Thu, Sep 27, 2018 at 12:12 AM Johannes Thumshirn <jthumshirn@suse.de> wrote:
>
> On Wed, Sep 26, 2018 at 09:24:20PM -0700, Dan Williams wrote:
> > The Intel NVDIMM command specification publishes a dirty-shutdown-count
> > in addition to the dirty-shutdown / flush-failed indication that comes
> > from the ACPI NFIT. This is expected to be a common property of NVDIMMs
> > and is a static hardware health detail to be cached / exported via
> > sysfs.
> >
> > Add plumbing for retrieving this data at driver load time, publish the
> > count, and use the dynamically retrieved dirty-shutdown indicator to
> > augment the existing 'flush_failed' flag.
>
> Is this the same thing as the LSS Latch stuff that went into ndctl?

It's a replacement. The latch mechanism is awkward, especially when all
that is needed is a rolling count of dirty-shutdown events. The
expectation going forward is that the platform firmware will handle
the latch, if it is present, and the OS need only consume the
dirty-shutdown count. The ndctl implementation called libndctl APIs
from the udev queue, which we discovered injects unnecessary udev queue
drains / stalls into the boot path. Lastly, the userspace caching
scheme for non-root users to consume the dirty-shutdown-count just
isn't as efficient as teaching the kernel to cache this value and
export it as a standard sysfs attribute.
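
To make the consumer side concrete, here is a minimal userspace sketch of
reading such a cached count from a sysfs-style text attribute. The helper
name read_dirty_shutdown_count is hypothetical, and the attribute path and
its text format (a single integer followed by a newline) are assumptions,
not something this thread specifies.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one unsigned integer from a sysfs-style text attribute.
 * Returns 0 and stores the value in *count on success, -1 on failure.
 * The path is supplied by the caller; no particular sysfs layout is
 * assumed here.
 */
int read_dirty_shutdown_count(const char *path, unsigned long long *count)
{
	char buf[64];
	char *end;
	unsigned long long val;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	/* Base 0 accepts decimal as well as 0x-prefixed hex. */
	val = strtoull(buf, &end, 0);
	if (errno || end == buf)
		return -1;

	*count = val;
	return 0;
}
```

Because the attribute is an ordinary file, an unprivileged caller only needs
read permission on it, which is the efficiency argument made above: no
userspace cache and no command submission to the DIMM on each query.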