Message ID | 1487832999-3390-1-git-send-email-nab@linux-iscsi.org (mailing list archive) |
---|---|
State | Accepted |
> From: Nicholas Bellinger <nab@linux-iscsi.org>
>
> When transport_clear_lun_ref() is shutting down a se_lun via
> configfs with new I/O in-flight, it's possible to trigger a
> NULL pointer dereference in transport_lookup_cmd_lun() due
> to the fact percpu_ref_get() doesn't do any __PERCPU_REF_DEAD
> checking before incrementing lun->lun_ref.count after
> lun->lun_ref has switched to atomic_t mode.
>
> This results in a NULL pointer dereference as LUN shutdown
> code in core_tpg_remove_lun() continues running after the
> existing ->release() -> core_tpg_lun_ref_release() callback
> completes, and clears the RCU protected se_lun->lun_se_dev
> pointer.
>
> During the OOPs, the state of lun->lun_ref in the process
> which triggered the NULL pointer dereference looks like
> the following on v4.1.y stable code:
>
> struct se_lun {
>   lun_link_magic = 4294932337,
>   lun_status = TRANSPORT_LUN_STATUS_FREE,
>
>   .....
>
>   lun_se_dev = 0x0,
>   lun_sep = 0x0,
>
>   .....
>
>   lun_ref = {
>     count = {
>       counter = 1
>     },
>     percpu_count_ptr = 3,
>     release = 0xffffffffa02fa1e0 <core_tpg_lun_ref_release>,
>     confirm_switch = 0x0,
>     force_atomic = false,
>     rcu = {
>       next = 0xffff88154fa1a5d0,
>       func = 0xffffffff8137c4c0 <percpu_ref_switch_to_atomic_rcu>
>     }
>   }
> }
>
> To address this bug, use percpu_ref_tryget_live() to ensure
> once __PERCPU_REF_DEAD is visible on all CPUs and ->lun_ref
> has switched to atomic_t, all new I/Os will fail to obtain
> a new lun->lun_ref reference.
>
> Also use an explicit percpu_ref_kill_and_confirm() callback
> to block on ->lun_ref_comp to allow the first stage and
> associated RCU grace period to complete, and then block on
> ->lun_ref_shutdown waiting for the final percpu_ref_put()
> to drop the last reference via transport_lun_remove_cmd()
> before continuing with core_tpg_remove_lun() shutdown.
>
> Reported-by: Rob Millner <rlm@daterainc.com>
> Tested-by: Rob Millner <rlm@daterainc.com>
> Cc: Rob Millner <rlm@daterainc.com>
> Tested-by: Vaibhav Tandon <vst@datera.io>
> Cc: Vaibhav Tandon <vst@datera.io>
> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
> ---
>  drivers/target/target_core_device.c    | 10 ++++++++--
>  drivers/target/target_core_tpg.c       |  3 ++-
>  drivers/target/target_core_transport.c | 31 ++++++++++++++++++++++++++++++-
>  include/target/target_core_base.h      |  1 +
>  4 files changed, 41 insertions(+), 4 deletions(-)
>

I have seen this and have tested this with our custom kernel.

So this looks good from me!

-Bryant

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2017-02-23 at 11:46 -0600, Bryant G. Ly wrote:
> I have seen this and have tested this with our custom kernel.
>
> So this looks good from me!
>

Added your Tested-by to the patch.

Thanks Bryant.
diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
index cb7047d..c754ae3 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -78,12 +78,16 @@
 					&deve->read_bytes);
 
 		se_lun = rcu_dereference(deve->se_lun);
+
+		if (!percpu_ref_tryget_live(&se_lun->lun_ref)) {
+			se_lun = NULL;
+			goto out_unlock;
+		}
+
 		se_cmd->se_lun = rcu_dereference(deve->se_lun);
 		se_cmd->pr_res_key = deve->pr_res_key;
 		se_cmd->orig_fe_lun = unpacked_lun;
 		se_cmd->se_cmd_flags |= SCF_SE_LUN_CMD;
-
-		percpu_ref_get(&se_lun->lun_ref);
 		se_cmd->lun_ref_active = true;
 
 		if ((se_cmd->data_direction == DMA_TO_DEVICE) &&
@@ -97,6 +101,7 @@
 			goto ref_dev;
 		}
 	}
+out_unlock:
 	rcu_read_unlock();
 
 	if (!se_lun) {
@@ -815,6 +820,7 @@ struct se_device *target_alloc_device(struct se_hba *hba, const char *name)
 	xcopy_lun = &dev->xcopy_lun;
 	rcu_assign_pointer(xcopy_lun->lun_se_dev, dev);
 	init_completion(&xcopy_lun->lun_ref_comp);
+	init_completion(&xcopy_lun->lun_shutdown_comp);
 	INIT_LIST_HEAD(&xcopy_lun->lun_deve_list);
 	INIT_LIST_HEAD(&xcopy_lun->lun_dev_link);
 	mutex_init(&xcopy_lun->lun_tg_pt_md_mutex);
diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
index d99752c..2744251 100644
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -445,7 +445,7 @@ static void core_tpg_lun_ref_release(struct percpu_ref *ref)
 {
 	struct se_lun *lun = container_of(ref, struct se_lun, lun_ref);
 
-	complete(&lun->lun_ref_comp);
+	complete(&lun->lun_shutdown_comp);
 }
 
 int core_tpg_register(
@@ -571,6 +571,7 @@ struct se_lun *core_tpg_alloc_lun(
 	lun->lun_link_magic = SE_LUN_LINK_MAGIC;
 	atomic_set(&lun->lun_acl_count, 0);
 	init_completion(&lun->lun_ref_comp);
+	init_completion(&lun->lun_shutdown_comp);
 	INIT_LIST_HEAD(&lun->lun_deve_list);
 	INIT_LIST_HEAD(&lun->lun_dev_link);
 	atomic_set(&lun->lun_tg_pt_secondary_offline, 0);
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index efb9e6f..434d9d6 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2700,10 +2700,39 @@ void target_wait_for_sess_cmds(struct se_session *se_sess)
 }
 EXPORT_SYMBOL(target_wait_for_sess_cmds);
 
+static void target_lun_confirm(struct percpu_ref *ref)
+{
+	struct se_lun *lun = container_of(ref, struct se_lun, lun_ref);
+
+	complete(&lun->lun_ref_comp);
+}
+
 void transport_clear_lun_ref(struct se_lun *lun)
 {
-	percpu_ref_kill(&lun->lun_ref);
+	/*
+	 * Mark the percpu-ref as DEAD, switch to atomic_t mode, drop
+	 * the initial reference and schedule confirm kill to be
+	 * executed after one full RCU grace period has completed.
+	 */
+	percpu_ref_kill_and_confirm(&lun->lun_ref, target_lun_confirm);
+	/*
+	 * The first completion waits for percpu_ref_switch_to_atomic_rcu()
+	 * to call target_lun_confirm after lun->lun_ref has been marked
+	 * as __PERCPU_REF_DEAD on all CPUs, and switches to atomic_t
+	 * mode so that percpu_ref_tryget_live() lookup of lun->lun_ref
+	 * fails for all new incoming I/O.
+	 */
 	wait_for_completion(&lun->lun_ref_comp);
+	/*
+	 * The second completion waits for percpu_ref_put_many() to
+	 * invoke ->release() after lun->lun_ref has switched to
+	 * atomic_t mode, and lun->lun_ref.count has reached zero.
+	 *
+	 * At this point all target-core lun->lun_ref references have
+	 * been dropped via transport_lun_remove_cmd(), and it's safe
+	 * to proceed with the remaining LUN shutdown.
+	 */
+	wait_for_completion(&lun->lun_shutdown_comp);
 }
 
 static bool
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index d7336f3..16d3be8 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -730,6 +730,7 @@ struct se_lun {
 	struct config_group	lun_group;
 	struct se_port_stat_grps port_stat_grps;
 	struct completion	lun_ref_comp;
+	struct completion	lun_shutdown_comp;
 	struct percpu_ref	lun_ref;
 	struct list_head	lun_dev_link;
 	struct hlist_node	link;
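
For readers who want to see why swapping percpu_ref_get() for percpu_ref_tryget_live() matters, the behaviour described in the commit message can be approximated with a small userspace model. This is only a sketch under loud assumptions: struct model_ref and every model_ref_*() helper are hypothetical names invented here, not kernel API; the model is single-threaded and does not reproduce per-CPU counters, the RCU grace period, or the confirm callback. It only shows that an unconditional get still succeeds on a dead reference, while a tryget-live style lookup refuses it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a killable refcount (not struct percpu_ref). */
struct model_ref {
	atomic_long count;	/* starts at 1: the initial reference */
	atomic_bool dead;	/* loosely models the __PERCPU_REF_DEAD flag */
	bool released;		/* set once the modelled ->release() has run */
};

static void model_ref_init(struct model_ref *ref)
{
	atomic_init(&ref->count, 1);
	atomic_init(&ref->dead, false);
	ref->released = false;
}

/* Like percpu_ref_get() in spirit: increments without checking the dead flag. */
static void model_ref_get(struct model_ref *ref)
{
	atomic_fetch_add(&ref->count, 1);
}

/* Like percpu_ref_tryget_live() in spirit: fails once the ref is marked dead. */
static bool model_ref_tryget_live(struct model_ref *ref)
{
	if (atomic_load(&ref->dead))
		return false;
	atomic_fetch_add(&ref->count, 1);
	return true;
}

/* Drop one reference; the last put runs the modelled release step. */
static void model_ref_put(struct model_ref *ref)
{
	long prev = atomic_fetch_sub(&ref->count, 1);

	assert(prev > 0);
	if (prev == 1)
		ref->released = true;
}

/* Mark the ref dead and drop the initial reference. */
static void model_ref_kill(struct model_ref *ref)
{
	atomic_store(&ref->dead, true);
	model_ref_put(ref);
}
```

In this model, calling model_ref_kill() on a reference with no in-flight users drops the count to zero and sets released; a later model_ref_get() then raises the count back to 1 on an already released object, whereas model_ref_tryget_live() returns false and leaves the count untouched. In the real patch the remaining window between the dead check and the increment is closed by waiting on ->lun_ref_comp for the confirm callback, which this sketch does not attempt to model.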