Message ID | 20181227190842.GA19565@myunghoj-Precision-5530 (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | libceph: protect pending flags in ceph_con_keepalive() | expand |
On Thu, Dec 27, 2018 at 8:08 PM Myungho Jung <mhjungk@gmail.com> wrote: > > con_flag_test_and_set() sets CON_FLAG_KEEPALIVE_PENDING and > CON_FLAG_WRITE_PENDING flags without protection in ceph_con_keepalive(). > It triggers WARN_ON() in clear_standby() if the flags are set after > con_fault() changes connection state to CON_STATE_STANDBY. Move > con_flag_test_and_set() to be called before releasing the lock and store > the condition to check after the critical section. > > Reported-by: syzbot+acdeb633f6211ccdf886@syzkaller.appspotmail.com > Signed-off-by: Myungho Jung <mhjungk@gmail.com> > --- > net/ceph/messenger.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > index 2f126eff275d..e15da22d4f37 100644 > --- a/net/ceph/messenger.c > +++ b/net/ceph/messenger.c > @@ -3216,12 +3216,16 @@ void ceph_msg_revoke_incoming(struct ceph_msg *msg) > */ > void ceph_con_keepalive(struct ceph_connection *con) > { > + bool pending; > + > dout("con_keepalive %p\n", con); > mutex_lock(&con->mutex); > clear_standby(con); > + pending = (con_flag_test_and_set(con, > + CON_FLAG_KEEPALIVE_PENDING) == 0 && > + con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0); > mutex_unlock(&con->mutex); > - if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 && > - con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > + if (pending) > queue_con(con); > } > EXPORT_SYMBOL(ceph_con_keepalive); Hi Myungho, Were you able to reproduce? If so, did you use the syzkaller output or something else? Thanks, Ilya
On Wed, Jan 02, 2019 at 04:42:47PM +0100, Ilya Dryomov wrote: > On Thu, Dec 27, 2018 at 8:08 PM Myungho Jung <mhjungk@gmail.com> wrote: > > > > con_flag_test_and_set() sets CON_FLAG_KEEPALIVE_PENDING and > > CON_FLAG_WRITE_PENDING flags without protection in ceph_con_keepalive(). > > It triggers WARN_ON() in clear_standby() if the flags are set after > > con_fault() changes connection state to CON_STATE_STANDBY. Move > > con_flag_test_and_set() to be called before releasing the lock and store > > the condition to check after the critical section. > > > > Reported-by: syzbot+acdeb633f6211ccdf886@syzkaller.appspotmail.com > > Signed-off-by: Myungho Jung <mhjungk@gmail.com> > > --- > > net/ceph/messenger.c | 8 ++++++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > > index 2f126eff275d..e15da22d4f37 100644 > > --- a/net/ceph/messenger.c > > +++ b/net/ceph/messenger.c > > @@ -3216,12 +3216,16 @@ void ceph_msg_revoke_incoming(struct ceph_msg *msg) > > */ > > void ceph_con_keepalive(struct ceph_connection *con) > > { > > + bool pending; > > + > > dout("con_keepalive %p\n", con); > > mutex_lock(&con->mutex); > > clear_standby(con); > > + pending = (con_flag_test_and_set(con, > > + CON_FLAG_KEEPALIVE_PENDING) == 0 && > > + con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0); > > mutex_unlock(&con->mutex); > > - if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 && > > - con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > > + if (pending) > > queue_con(con); > > } > > EXPORT_SYMBOL(ceph_con_keepalive); > > Hi Myungho, > > Were you able to reproduce? If so, did you use the syzkaller output or > something else? > > Thanks, > > Ilya Hi Ilya, I reproduced on vm using syzkaller utils and verified the fix by syzbot. Thanks, Myungho
On Thu, Jan 3, 2019 at 4:50 AM Myungho Jung <mhjungk@gmail.com> wrote:
> I reproduced on vm using syzkaller utils and verified the fix by syzbot.
Hi Myungho,
I think this might be a better fix:
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index d5718284db57..c5f5313e3537 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -3205,10 +3205,11 @@ void ceph_con_keepalive(struct ceph_connection *con)
{
dout("con_keepalive %p\n", con);
mutex_lock(&con->mutex);
+ con_flag_set(con, CON_FLAG_KEEPALIVE_PENDING);
clear_standby(con);
mutex_unlock(&con->mutex);
- if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 &&
- con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
+
+ if (con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
queue_con(con);
}
EXPORT_SYMBOL(ceph_con_keepalive);
WRITE_PENDING can be set without con->mutex held from socket callbacks.
This is the reason we use atomic bit ops here, so testing WRITE_PENDING
under the lock didn't make sense to me.
At the same time, KEEPALIVE_PENDING could have been a non-atomic flag.
I spent some time trying to make sense of conditioning queue_con() call
on the previous value of KEEPALIVE_PENDING and couldn't see any, so I'm
setting it with con_flag_set(), making ceph_con_keepalive() symmetric
with ceph_con_send().
Thanks,
Ilya
On Mon, Jan 14, 2019 at 09:37:25PM +0100, Ilya Dryomov wrote: > On Thu, Jan 3, 2019 at 4:50 AM Myungho Jung <mhjungk@gmail.com> wrote: > > I reproduced on vm using syzkaller utils and verified the fix by syzbot. > > Hi Myungho, > > I think this might be a better fix: > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > index d5718284db57..c5f5313e3537 100644 > --- a/net/ceph/messenger.c > +++ b/net/ceph/messenger.c > @@ -3205,10 +3205,11 @@ void ceph_con_keepalive(struct ceph_connection *con) > { > dout("con_keepalive %p\n", con); > mutex_lock(&con->mutex); > + con_flag_set(con, CON_FLAG_KEEPALIVE_PENDING); > clear_standby(con); > mutex_unlock(&con->mutex); > - if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 && > - con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > + > + if (con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > queue_con(con); > } > EXPORT_SYMBOL(ceph_con_keepalive); > > WRITE_PENDING can be set without con->mutex held from socket callbacks. > This is the reason we use atomic bit ops here, so testing WRITE_PENDING > under the lock didn't make sense to me. > > At the same time, KEEPALIVE_PENDING could have been a non-atomic flag. > I spent some time trying to make sense of conditioning queue_con() call > on the previous value of KEEPALIVE_PENDING and couldn't see any, so I'm > setting it with con_flag_set(), making ceph_con_keepalive() symmetric > with ceph_con_send(). > > Thanks, > > Ilya Hi Ilya, Yes, it looks clear and makes sense to have an atomic operation in if statement but it still triggers warning. KEEPALIVE_PENDING should be set after clear_standby() because con_fault() can be called right before acquiring the lock here which sets the flag in standby state. I tesed the change with syzbot and confirmed there was no warning. Thanks, Myungho
On Tue, Jan 15, 2019 at 7:56 AM Myungho Jung <mhjungk@gmail.com> wrote: > > On Mon, Jan 14, 2019 at 09:37:25PM +0100, Ilya Dryomov wrote: > > On Thu, Jan 3, 2019 at 4:50 AM Myungho Jung <mhjungk@gmail.com> wrote: > > > I reproduced on vm using syzkaller utils and verified the fix by syzbot. > > > > Hi Myungho, > > > > I think this might be a better fix: > > > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > > index d5718284db57..c5f5313e3537 100644 > > --- a/net/ceph/messenger.c > > +++ b/net/ceph/messenger.c > > @@ -3205,10 +3205,11 @@ void ceph_con_keepalive(struct ceph_connection *con) > > { > > dout("con_keepalive %p\n", con); > > mutex_lock(&con->mutex); > > + con_flag_set(con, CON_FLAG_KEEPALIVE_PENDING); > > clear_standby(con); > > mutex_unlock(&con->mutex); > > - if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 && > > - con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > > + > > + if (con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > > queue_con(con); > > } > > EXPORT_SYMBOL(ceph_con_keepalive); > > > > WRITE_PENDING can be set without con->mutex held from socket callbacks. > > This is the reason we use atomic bit ops here, so testing WRITE_PENDING > > under the lock didn't make sense to me. > > > > At the same time, KEEPALIVE_PENDING could have been a non-atomic flag. > > I spent some time trying to make sense of conditioning queue_con() call > > on the previous value of KEEPALIVE_PENDING and couldn't see any, so I'm > > setting it with con_flag_set(), making ceph_con_keepalive() symmetric > > with ceph_con_send(). > > > > Thanks, > > > > Ilya > > Hi Ilya, > > Yes, it looks clear and makes sense to have an atomic operation in if statement > but it still triggers warning. KEEPALIVE_PENDING should be set after > clear_standby() because con_fault() can be called right before acquiring the > lock here which sets the flag in standby state. I tesed the change with syzbot > and confirmed there was no warning. Right, it still triggers one of the warnings. I was too focused on WRITE_PENDING and missed that in plain sight. I'll update the patch. Thanks for testing! Ilya
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index 2f126eff275d..e15da22d4f37 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -3216,12 +3216,16 @@ void ceph_msg_revoke_incoming(struct ceph_msg *msg) */ void ceph_con_keepalive(struct ceph_connection *con) { + bool pending; + dout("con_keepalive %p\n", con); mutex_lock(&con->mutex); clear_standby(con); + pending = (con_flag_test_and_set(con, + CON_FLAG_KEEPALIVE_PENDING) == 0 && + con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0); mutex_unlock(&con->mutex); - if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 && - con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) + if (pending) queue_con(con); } EXPORT_SYMBOL(ceph_con_keepalive);
con_flag_test_and_set() sets CON_FLAG_KEEPALIVE_PENDING and CON_FLAG_WRITE_PENDING flags without protection in ceph_con_keepalive(). It triggers WARN_ON() in clear_standby() if the flags are set after con_fault() changes connection state to CON_STATE_STANDBY. Move con_flag_test_and_set() to be called before releasing the lock and store the condition to check after the critical section. Reported-by: syzbot+acdeb633f6211ccdf886@syzkaller.appspotmail.com Signed-off-by: Myungho Jung <mhjungk@gmail.com> --- net/ceph/messenger.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)