Message ID | CAGXu5jKUP9o-ZgW5Wa5-9DHeQNZ5VA3cLBCJ0P7AVkTQ5tqHtQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | nvdimm crash at boot |
On Tue, Jan 8, 2019 at 3:10 PM Kees Cook <keescook@chromium.org> wrote:
>
> This is a warn that I added to fail more gracefully (sorry for
> whitespace damage):
>
> diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
> index 4890310df874..1161b994b1ec 100644
> --- a/drivers/nvdimm/dimm_devs.c
> +++ b/drivers/nvdimm/dimm_devs.c
> @@ -516,6 +516,8 @@ static umode_t nvdimm_visible(struct kobject *kobj, struct attribute *a, int n)
>  		return a->mode;
>  	if (nvdimm->sec.state < 0)
>  		return 0;
> +	if (WARN_ON_ONCE(!nvdimm->sec.ops))
> +		return 0;
>  	/* Are there any state mutation ops? */
>  	if (nvdimm->sec.ops->freeze || nvdimm->sec.ops->disable
>  			|| nvdimm->sec.ops->change_key
>
> Without it, I would crash at boot due to the sec.ops dereference. It's
> not clear to me if there is a better solution than just the sec.ops
> NULL test (i.e. should it ever be NULL?)

It will always be NULL for anything other than real nvdimms with
security support.

>
> [ 1.393599] WARNING: CPU: 3 PID: 484 at drivers/nvdimm/dimm_devs.c:519 nvdimm_visible+0x79/0x80
> [ 1.393858] Modules linked in:
> [ 1.393858] CPU: 3 PID: 484 Comm: kworker/u8:3 Not tainted 5.0.0-rc1+ #926
> [ 1.393858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
> [ 1.396781] Workqueue: events_unbound async_run_entry_fn
> [ 1.396781] RIP: 0010:nvdimm_visible+0x79/0x80
> [ 1.396781] Code: e8 4c fc ff ff eb c7 48 83 78 20 00 75 e6 48 83 78 10 00 75 df 48 83 78 28 00 75 d8 48 83 78 30 00 75 d1 b8 24 01 00 00 eb b1 <0f> 0b eb ad 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55
> [ 1.396781] RSP: 0000:ffffb911803abd00 EFLAGS: 00010246
> [ 1.396781] RAX: 0000000000000000 RBX: ffffffff98cf5a80 RCX: 00000000000001a4
> [ 1.396781] RDX: 0000000000000004 RSI: ffffffff98cf5a80 RDI: ffff94e7ed088028
> [ 1.396781] RBP: ffffb911803abd10 R08: 0000000000000000 R09: 0000000000000001
> [ 1.396781] R10: ffffb911803abaf8 R11: 0000000000000000 R12: ffff94e7ed088028
> [ 1.396781] R13: ffff94e7ed088028 R14: ffffffff98cf5a60 R15: 0000000000000000
> [ 1.396781] FS: 0000000000000000(0000) GS:ffff94e7efb80000(0000) knlGS:0000000000000000
> [ 1.396781] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.396781] CR2: 00000000ffffffff CR3: 0000000150822001 CR4: 00000000001606e0
> [ 1.396781] Call Trace:
> [ 1.396781] internal_create_group+0xf4/0x380
> [ 1.396781] sysfs_create_groups+0x46/0xb0
> [ 1.396781] device_add+0x331/0x680
> [ 1.396781] nd_async_device_register+0x15/0x60
> [ 1.396781] async_run_entry_fn+0x38/0x100
> [ 1.396781] process_one_work+0x22b/0x5a0
> [ 1.396781] worker_thread+0x3f/0x3b0
> [ 1.396781] kthread+0x12b/0x150
> [ 1.396781] ? process_one_work+0x5a0/0x5a0
> [ 1.396781] ? kthread_park+0xa0/0xa0
> [ 1.396781] ret_from_fork+0x24/0x30
> [ 1.396781] irq event stamp: 952
> [ 1.396781] hardirqs last enabled at (951): [<ffffffff973f5cb4>] __slab_alloc.constprop.79+0x44/0x70
> [ 1.396781] hardirqs last disabled at (952): [<ffffffff97201cf0>] trace_hardirqs_off_thunk+0x1a/0x1c
> [ 1.396781] softirqs last enabled at (0): [<ffffffff97267ae3>] copy_process.part.55+0x413/0x1f10
> [ 1.396781] softirqs last disabled at (0): [<0000000000000000>] (null)
> [ 1.396781] ---[ end trace 5608ce056f09564f ]---
>
> I assume this crash is due to me using nvdimm without any special
> markings (i.e. I'm using it crudely with pstore), in KVM:
>
> RAM_SIZE=16384
> NVDIMM_SIZE=128
> MAX_SIZE=$(( RAM_SIZE + NVDIMM_SIZE ))
>
> sudo qemu-system-x86_64 \
>     ...
>     -machine pc,nvdimm \
>     -m ${RAM_SIZE}M,slots=2,maxmem=${MAX_SIZE}M \
>     -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.img,size=${NVDIMM_SIZE}M,align=128M \
>     -device nvdimm,id=nvdimm1,memdev=mem1 \

Ah, thanks for the report! The key difference is that you don't define
a "label area", so the driver bails out early and never initializes
the security state.

This should fix it up.

diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 4890310df874..636cdb06ee17 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -514,7 +514,7 @@ static umode_t nvdimm_visible(struct kobject *kobj, struct attribute *a, int n)
 
 	if (a != &dev_attr_security.attr)
 		return a->mode;
-	if (nvdimm->sec.state < 0)
+	if (!nvdimm->sec.ops || nvdimm->sec.state < 0)
 		return 0;
 	/* Are there any state mutation ops? */
 	if (nvdimm->sec.ops->freeze || nvdimm->sec.ops->disable
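
For readers following along, the guard in the patch above can be sketched in plain userspace C. This is only an illustration, not the kernel code: `sec_ops_sketch`, `nvdimm_sketch`, `sketch_noop` and `security_attr_mode` are hypothetical stand-ins, and the 0644/0444 modes are assumptions chosen for the example. The point it shows is that `||` short-circuits, so the ops pointer is tested before anything dereferences it.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the libnvdimm structures. */
struct sec_ops_sketch {
	int (*freeze)(void);
	int (*disable)(void);
	int (*change_key)(void);
};

struct nvdimm_sketch {
	const struct sec_ops_sketch *ops; /* NULL without security support */
	int state;                        /* negative: state query failed */
};

/* Placeholder operation so callers can populate an ops table. */
int sketch_noop(void)
{
	return 0;
}

/*
 * Mode for a "security" attribute; 0 hides the attribute.
 * The 0644 and 0444 values are assumptions made for this sketch.
 */
unsigned int security_attr_mode(const struct nvdimm_sketch *nvdimm)
{
	/* || short-circuits, so ops is never dereferenced when it is NULL. */
	if (!nvdimm->ops || nvdimm->state < 0)
		return 0;
	/* Any state mutation op present: expose the attribute as writable. */
	if (nvdimm->ops->freeze || nvdimm->ops->disable || nvdimm->ops->change_key)
		return 0644;
	return 0444;
}
```

With a NULL `ops` pointer the function returns 0 without touching the ops table, which is the behaviour the one-line change is after.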
On Tue, Jan 8, 2019 at 3:28 PM Dan Williams <dan.j.williams@intel.com> wrote:
> Ah, thanks for the report! The key difference is that you don't define
> a "label area", so the driver bails out early and never initializes
> the security state.
>
> This should fix it up.
>
> diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
> index 4890310df874..636cdb06ee17 100644
> --- a/drivers/nvdimm/dimm_devs.c
> +++ b/drivers/nvdimm/dimm_devs.c
> @@ -514,7 +514,7 @@ static umode_t nvdimm_visible(struct kobject *kobj, struct attribute *a, int n)
>
>  	if (a != &dev_attr_security.attr)
>  		return a->mode;
> -	if (nvdimm->sec.state < 0)
> +	if (!nvdimm->sec.ops || nvdimm->sec.state < 0)
>  		return 0;
>  	/* Are there any state mutation ops? */
>  	if (nvdimm->sec.ops->freeze || nvdimm->sec.ops->disable

Okay, cool. I wasn't sure if that test needed a deeper check. :)

Fixes: 37833fb7989a9 ("acpi/nfit, libnvdimm: Add freeze security support to Intel nvdimm")
Tested-by: Kees Cook <keescook@chromium.org>

Thanks!
On Tue, Jan 8, 2019 at 3:34 PM Kees Cook <keescook@chromium.org> wrote:
> Okay, cool. I wasn't sure if that test needed a deeper check. :)
>
> Fixes: 37833fb7989a9 ("acpi/nfit, libnvdimm: Add freeze security support to Intel nvdimm")
> Tested-by: Kees Cook <keescook@chromium.org>

Actually, looking closer this should have been avoided by the fact
that __nvdimm_create() initializes the security state early and that
nvdimm->sec.state should have saved us.

I'll dig a bit deeper with your qemu config.
On Tue, Jan 8, 2019 at 3:54 PM Dan Williams <dan.j.williams@intel.com> wrote:
> Actually, looking closer this should have been avoided by the fact
> that __nvdimm_create() initializes the security state early and that
> nvdimm->sec.state should have saved us.
>
> I'll dig a bit deeper with your qemu config.

Maybe something goes weird with pstore stealing the region?
On Tue, Jan 8, 2019 at 3:55 PM Kees Cook <keescook@chromium.org> wrote:
> Maybe something goes weird with pstore stealing the region?

No, pstore is off the hook. I was just able to reproduce locally and
I'm not doing anything with pstore.
On Tue, Jan 8, 2019 at 4:02 PM Dan Williams <dan.j.williams@intel.com> wrote:
> No, pstore is off the hook. I was just able to reproduce locally and
> I'm not doing anything with pstore.

Huh, this fixes it:

diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 5440f11b0907..7315977b64da 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -160,6 +160,7 @@ static inline struct nd_blk_region_desc *to_blk_region_desc(
 }
 
 enum nvdimm_security_state {
+	NVDIMM_SECURITY_ERROR = -1,
 	NVDIMM_SECURITY_DISABLED,
 	NVDIMM_SECURITY_UNLOCKED,
 	NVDIMM_SECURITY_LOCKED,

Apparently I was wrong to think an enum was a signed int without
actually making a signed value a possibility. I would have expected
the compiler to give me a "statement has no effect" for testing for a
negative value against an effectively unsigned quantity.
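
The enum behaviour described above can be reproduced outside the kernel. The C standard leaves the underlying type of an enum implementation-defined, and GCC documents that it normally picks `unsigned int` when no enumerator is negative. The sketch below assumes nothing about libnvdimm beyond the diff: `state_before`, `state_after`, `ENUM_IS_SIGNED` and the two helper functions are hypothetical names for illustration. On compilers that choose an unsigned type, the `< 0` test in the "before" case can never be true.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the enum before the fix: no negative value. */
enum state_before { BEFORE_DISABLED, BEFORE_UNLOCKED, BEFORE_LOCKED };

/* Hypothetical stand-in after the fix: -1 forces a signed underlying type. */
enum state_after {
	AFTER_ERROR = -1,
	AFTER_DISABLED,
	AFTER_UNLOCKED,
	AFTER_LOCKED
};

/* True when the compiler chose a signed underlying type for enum type T. */
#define ENUM_IS_SIGNED(T) ((T)-1 < (T)0)

/* Stores an errno-style return code in the enum, then tests for an error. */
bool error_detected_before(int rc)
{
	enum state_before s = (enum state_before)rc;

	/* Always false if the underlying type is unsigned. */
	return s < 0;
}

bool error_detected_after(int rc)
{
	enum state_after s = (enum state_after)rc;

	/* The enum must be able to represent -1, so this test works. */
	return s < 0;
}
```

Whether `error_detected_before(-1)` is true depends on the compiler's choice of underlying type, which is exactly the trap; `error_detected_after(-1)` is true everywhere. GCC's `-Wtype-limits` warning is aimed at always-false comparisons of this kind, though whether a given kernel build enables it is a separate question.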
On Tue, Jan 8, 2019 at 4:49 PM Dan Williams <dan.j.williams@intel.com> wrote:
> Huh, this fixes it:
>
> diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
> index 5440f11b0907..7315977b64da 100644
> --- a/include/linux/libnvdimm.h
> +++ b/include/linux/libnvdimm.h
> @@ -160,6 +160,7 @@ static inline struct nd_blk_region_desc *to_blk_region_desc(
>  }
>
>  enum nvdimm_security_state {
> +	NVDIMM_SECURITY_ERROR = -1,
>  	NVDIMM_SECURITY_DISABLED,
>  	NVDIMM_SECURITY_UNLOCKED,
>  	NVDIMM_SECURITY_LOCKED,
>
> Apparently I was wrong to think an enum was a signed int without
> actually making a signed value a possibility. I would have expected
> the compiler to give me a "statement has no effect" for testing for a
> negative value against an effectively unsigned quantity.

Thanks for the one-line patch! It fixed the same crash for me.

-- Dexuan
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 4890310df874..1161b994b1ec 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -516,6 +516,8 @@ static umode_t nvdimm_visible(struct kobject *kobj, struct attribute *a, int n)
 		return a->mode;
 	if (nvdimm->sec.state < 0)
 		return 0;
+	if (WARN_ON_ONCE(!nvdimm->sec.ops))
+		return 0;
 	/* Are there any state mutation ops? */
 	if (nvdimm->sec.ops->freeze || nvdimm->sec.ops->disable
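
The "warn once, then bail out" pattern in this patch can be approximated in userspace as follows. This is a rough sketch only: the kernel's `WARN_ON_ONCE()` is a macro that also dumps a backtrace, whereas `warn_once`, `dimm_sketch` and `visible_mode` here are hypothetical names, and the 0444 mode is an assumption for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns cond unchanged, and prints a warning only
 * the first time cond is true for the given flag.
 */
bool warn_once(bool cond, bool *already_warned, const char *what)
{
	if (cond && !*already_warned) {
		*already_warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical device with an optional ops table. */
struct dimm_sketch {
	const void *ops; /* NULL when no security ops were registered */
};

/* Returns 0 (hidden) instead of dereferencing a NULL ops pointer. */
unsigned int visible_mode(const struct dimm_sketch *dimm)
{
	static bool warned; /* one flag per call site, as with the kernel macro */

	if (warn_once(dimm->ops == NULL, &warned, "security ops missing"))
		return 0;
	return 0444; /* assumed read-only mode for this sketch */
}
```

Repeated calls with a NULL `ops` keep returning 0 but log only one warning, so a misconfigured device cannot flood the log.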