Message ID | 59FAB46C.9050703@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Hi Alex,

For local heartbeat mode, I understand that if the region is not found,
o2hb_live_lock won't be released. And if the region is found, it doesn't
have to iterate the next region. So I agree with your fix in this case.

But I still don't get how to make sure the safe iteration in case of
global heartbeat mode.

Thanks,
Joseph

On 17/11/2 14:00, alex chen wrote:
> In the following situation, the down_write() will be called under
> the spin_lock(), which may lead a soft lockup:
> o2hb_region_inc_user
>  spin_lock(&o2hb_live_lock)
>   o2hb_region_pin
>    o2nm_depend_item
>     configfs_depend_item
>      inode_lock
>       down_write
>       -->here may sleep and reschedule
>
> So we should unlock the o2hb_live_lock before the o2nm_depend_item(), and
> get item reference in advance to prevent the region to be released.
>
> Signed-off-by: Alex Chen <alex.chen@huawei.com>
> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
> Reviewed-by: Jun Piao <piaojun@huawei.com>
> ---
>  fs/ocfs2/cluster/heartbeat.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
> index d020604..07b2fdc 100644
> --- a/fs/ocfs2/cluster/heartbeat.c
> +++ b/fs/ocfs2/cluster/heartbeat.c
> @@ -2399,8 +2399,15 @@ static int o2hb_region_pin(const char *region_uuid)
>  		if (reg->hr_item_pinned || reg->hr_item_dropped)
>  			goto skip_pin;
>  
> +		config_item_get(&reg->hr_item);
> +		spin_unlock(&o2hb_live_lock);
> +
>  		/* Ignore ENOENT only for local hb (userdlm domain) */
>  		ret = o2nm_depend_item(&reg->hr_item);
> +
> +		config_item_put(&reg->hr_item);
> +		spin_lock(&o2hb_live_lock);
> +
>  		if (!ret) {
>  			mlog(ML_CLUSTER, "Pin region %s\n", uuid);
>  			reg->hr_item_pinned = 1;
>
Hi Joseph,

Thanks for your reply.

For the global heartbeat mode, concurrent access to reg->hr_item_pinned
may happen after unlocking. I will modify this patch and send patch v3.

Thanks,
Alex

On 2017/11/6 17:37, Joseph Qi wrote:
> Hi Alex,
>
> For local heartbeat mode, I understand that if the region is not found,
> o2hb_live_lock won't be released. And if the region is found, it doesn't
> have to iterate the next region. So I agree with your fix in this case.
>
> But I still don't get how to make sure the safe iteration in case of
> global heartbeat mode.
>
> Thanks,
> Joseph
>
> On 17/11/2 14:00, alex chen wrote:
>> In the following situation, the down_write() will be called under
>> the spin_lock(), which may lead a soft lockup:
>> o2hb_region_inc_user
>>  spin_lock(&o2hb_live_lock)
>>   o2hb_region_pin
>>    o2nm_depend_item
>>     configfs_depend_item
>>      inode_lock
>>       down_write
>>       -->here may sleep and reschedule
>>
>> So we should unlock the o2hb_live_lock before the o2nm_depend_item(), and
>> get item reference in advance to prevent the region to be released.
>>
>> Signed-off-by: Alex Chen <alex.chen@huawei.com>
>> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
>> Reviewed-by: Jun Piao <piaojun@huawei.com>
>> ---
>>  fs/ocfs2/cluster/heartbeat.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
>> index d020604..07b2fdc 100644
>> --- a/fs/ocfs2/cluster/heartbeat.c
>> +++ b/fs/ocfs2/cluster/heartbeat.c
>> @@ -2399,8 +2399,15 @@ static int o2hb_region_pin(const char *region_uuid)
>>  		if (reg->hr_item_pinned || reg->hr_item_dropped)
>>  			goto skip_pin;
>>  
>> +		config_item_get(&reg->hr_item);
>> +		spin_unlock(&o2hb_live_lock);
>> +
>>  		/* Ignore ENOENT only for local hb (userdlm domain) */
>>  		ret = o2nm_depend_item(&reg->hr_item);
>> +
>> +		config_item_put(&reg->hr_item);
>> +		spin_lock(&o2hb_live_lock);
>> +
>>  		if (!ret) {
>>  			mlog(ML_CLUSTER, "Pin region %s\n", uuid);
>>  			reg->hr_item_pinned = 1;
>>
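The concurrent access to reg->hr_item_pinned mentioned in the reply can be
illustrated with a small user-space sketch. It is not the v3 patch and not
kernel code: struct region, live_lock, depend_item(), simulate_race and
pin_region_checked() are all hypothetical stand-ins, a pthread mutex replaces
the spinlock, and reference counting is left out for brevity. The sketch shows
one possible way to cope with the window, namely re-checking the pinned flag
after the lock is taken again and dropping the duplicate dependency if another
caller pinned the region in the meantime.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct o2hb_region; not the kernel type. */
struct region {
	bool pinned;	/* plays the role of hr_item_pinned */
	int depends;	/* dependencies currently held on the item */
};

static pthread_mutex_t live_lock = PTHREAD_MUTEX_INITIALIZER;

/* Demo hook: when true, depend_item() also simulates a competing pinner. */
static bool simulate_race;

/* Stand-in for o2nm_depend_item(); must be called without live_lock held. */
static int depend_item(struct region *reg)
{
	pthread_mutex_lock(&live_lock);
	if (simulate_race && !reg->pinned) {
		/* Another caller finishes its own pin while we are unlocked. */
		reg->depends++;
		reg->pinned = true;
	}
	reg->depends++;		/* the dependency taken by this caller */
	pthread_mutex_unlock(&live_lock);
	return 0;
}

/* Caller holds live_lock on entry and on return. */
static int pin_region_checked(struct region *reg)
{
	int ret;

	if (reg->pinned)
		return 0;

	pthread_mutex_unlock(&live_lock);
	ret = depend_item(reg);		/* flag may change during this window */
	pthread_mutex_lock(&live_lock);
	if (ret)
		return ret;

	if (reg->pinned) {
		/* Lost the race: drop the duplicate dependency. */
		assert(reg->depends > 1);
		reg->depends--;
		return 0;
	}

	reg->pinned = true;
	return 0;
}
```

With simulate_race set, the region ends up pinned with exactly one dependency
instead of two. Whether the corresponding undo call in the kernel may run
under o2hb_live_lock is not addressed by this sketch.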
diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index d020604..07b2fdc 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -2399,8 +2399,15 @@ static int o2hb_region_pin(const char *region_uuid)
 		if (reg->hr_item_pinned || reg->hr_item_dropped)
 			goto skip_pin;
 
+		config_item_get(&reg->hr_item);
+		spin_unlock(&o2hb_live_lock);
+
 		/* Ignore ENOENT only for local hb (userdlm domain) */
 		ret = o2nm_depend_item(&reg->hr_item);
+
+		config_item_put(&reg->hr_item);
+		spin_lock(&o2hb_live_lock);
+
 		if (!ret) {
 			mlog(ML_CLUSTER, "Pin region %s\n", uuid);
 			reg->hr_item_pinned = 1;
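
The pattern used in the diff above (take a reference, drop the lock, make the
call that may sleep, then retake the lock and drop the reference) can be
sketched in user space as follows. This is an illustration only, under stated
assumptions: struct region, live_lock, depend_item() and pin_region() are
hypothetical names, a pthread mutex stands in for the spinlock, and a plain
counter stands in for the config_item reference. Unlike the diff, the sketch
drops the reference after retaking the lock, because the plain counter has no
other protection.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct o2hb_region; not the kernel type. */
struct region {
	int refcount;	/* plays the role of the config_item reference count */
	int pinned;	/* plays the role of hr_item_pinned */
};

static pthread_mutex_t live_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for o2nm_depend_item(): a call that is allowed to block. */
static int depend_item(struct region *reg)
{
	(void)reg;
	return 0;
}

/* Caller holds live_lock on entry and on return. */
static int pin_region(struct region *reg)
{
	int ret;

	assert(reg->refcount > 0);	/* the region must already be alive */
	if (reg->pinned)
		return 0;

	reg->refcount++;		/* like config_item_get() */
	pthread_mutex_unlock(&live_lock);

	ret = depend_item(reg);		/* may block; the lock is not held */

	pthread_mutex_lock(&live_lock);
	reg->refcount--;		/* like config_item_put() */

	if (!ret)
		reg->pinned = 1;
	return ret;
}
```

The extra reference only keeps the object itself alive across the unlocked
window. It does not by itself make a list walk safe or protect the pinned flag
from a concurrent caller, which is the concern raised in the thread for global
heartbeat mode.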