Message ID | 1370315998-10418-10-git-send-email-zheng.z.yan@intel.com (mailing list archive)
---|---
State | New, archived
Hi Yan-

On Tue, 4 Jun 2013, Yan, Zheng wrote:
> From: "Yan, Zheng" <zheng.z.yan@intel.com>
>
> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
> ---
>  fs/ceph/caps.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index 790f88b..458a66e 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -1982,8 +1982,14 @@ static void kick_flushing_inode_caps(struct ceph_mds_client *mdsc,
>  	cap = ci->i_auth_cap;
>  	dout("kick_flushing_inode_caps %p flushing %s flush_seq %lld\n", inode,
>  	     ceph_cap_string(ci->i_flushing_caps), ci->i_cap_flush_seq);
> +
>  	__ceph_flush_snaps(ci, &session, 1);

This function does funny things to the local session pointer... did you
consider this when using it below?  It can change to the auth cap mds if
it is different than the value passed in...

> +
>  	if (ci->i_flushing_caps) {
> +		spin_lock(&mdsc->cap_dirty_lock);
> +		list_move_tail(&ci->i_flushing_item, &session->s_cap_flushing);
> +		spin_unlock(&mdsc->cap_dirty_lock);
> +
>  		delayed = __send_cap(mdsc, cap, CEPH_CAP_OP_FLUSH,
>  				     __ceph_caps_used(ci),
>  				     __ceph_caps_wanted(ci),
> --
> 1.8.1.4
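[For context on the "funny things" Sage mentions: __ceph_flush_snaps() receives the session pointer by reference and may replace it before returning. The sketch below is a paraphrase of that calling pattern, not the actual fs/ceph code; the caller function is hypothetical, it assumes the fs/ceph internal headers, and locking and error handling are elided.]

```c
#include "super.h"        /* struct ceph_inode_info, ceph_inode() */
#include "mds_client.h"   /* struct ceph_mds_session, __ceph_flush_snaps() */

/*
 * Hypothetical caller illustrating the hazard under discussion:
 * __ceph_flush_snaps() takes the session by reference and may swap it
 * for the session of the MDS holding the auth cap, so anything done
 * with "session" after the call may refer to a different session than
 * the one the caller was handed.
 */
static void kick_flushing_inode_caps_sketch(struct ceph_mds_session *session,
					    struct inode *inode)
{
	struct ceph_inode_info *ci = ceph_inode(inode);
	struct ceph_mds_session *orig = session;

	__ceph_flush_snaps(ci, &session, 1);	/* may rewrite "session" */

	if (session != orig)
		/* the later list_move_tail(&ci->i_flushing_item,
		 * &session->s_cap_flushing) in the patch would now file the
		 * inode under the auth cap's session, not the caller's */
		pr_info("ceph: session swapped to mds%d by __ceph_flush_snaps\n",
			session->s_mds);
}
```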
On Mon, 10 Jun 2013, Sage Weil wrote:
> Hi Yan-
>
> On Tue, 4 Jun 2013, Yan, Zheng wrote:
>
> > From: "Yan, Zheng" <zheng.z.yan@intel.com>
> >
> > Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
> > ---
> >  fs/ceph/caps.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> > index 790f88b..458a66e 100644
> > --- a/fs/ceph/caps.c
> > +++ b/fs/ceph/caps.c
> > @@ -1982,8 +1982,14 @@ static void kick_flushing_inode_caps(struct ceph_mds_client *mdsc,
> >  	cap = ci->i_auth_cap;
> >  	dout("kick_flushing_inode_caps %p flushing %s flush_seq %lld\n", inode,
> >  	     ceph_cap_string(ci->i_flushing_caps), ci->i_cap_flush_seq);
> > +
> >  	__ceph_flush_snaps(ci, &session, 1);
>
> This function does funny things to the local session pointer... did you
> consider this when using it below?  It can change to the auth cap mds if
> it is different than the value passed in...

I wonder if we screwed something up here, but I just got a crash inside
remove_session_caps() that might be explained by a corrupt list.  I don't
think I've seen this before..

0xffff880214aabf20   753   2   1   3   R   0xffff880214aac3a8  *kworker/3:2
 ffff880224a33ae8 0000000000000018 ffffffffa0814d63 ffff880224f85800
 ffff88020b277790 ffff880224f85800 ffff88020c04e800 ffff880224a33c08
 ffffffffa081a1cf ffffffffffffffff ffff880224a33fd8 ffffffffffffffff
Call Trace:
 [<ffffffffa0814d63>] ? remove_session_caps+0x33/0x140 [ceph]
 [<ffffffffa081a1cf>] ? dispatch+0x7ff/0x1740 [ceph]
 [<ffffffff81510b66>] ? kernel_recvmsg+0x46/0x60
 [<ffffffffa07c4e38>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
 [<ffffffff810a317d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffffa07c81f8>] ? con_work+0x1948/0x2d50 [libceph]
 [<ffffffff81080c93>] ? idle_balance+0x133/0x180
 [<ffffffff81071c58>] ? finish_task_switch+0x48/0x110
 [<ffffffff81071c58>] ? finish_task_switch+0x48/0x110
 [<ffffffff8105f44f>] ? process_one_work+0x16f/0x540
 [<ffffffff8105f4ba>] ? process_one_work+0x1da/0x540
 [<ffffffff8105f44f>] ? process_one_work+0x16f/0x540
 [<ffffffff8106069c>] ? worker_thread+0x11c/0x370
 [<ffffffff81060580>] ? manage_workers.isra.20+0x2e0/0x2e0
 [<ffffffff8106735a>] ? kthread+0xea/0xf0

> > +
> >  	if (ci->i_flushing_caps) {
> > +		spin_lock(&mdsc->cap_dirty_lock);
> > +		list_move_tail(&ci->i_flushing_item, &session->s_cap_flushing);
> > +		spin_unlock(&mdsc->cap_dirty_lock);
> > +
> >  		delayed = __send_cap(mdsc, cap, CEPH_CAP_OP_FLUSH,
> >  				     __ceph_caps_used(ci),
> >  				     __ceph_caps_wanted(ci),
> > --
> > 1.8.1.4
On 06/11/2013 02:17 PM, Sage Weil wrote:
> On Mon, 10 Jun 2013, Sage Weil wrote:
>> Hi Yan-
>>
>> On Tue, 4 Jun 2013, Yan, Zheng wrote:
>>
>>> From: "Yan, Zheng" <zheng.z.yan@intel.com>
>>>
>>> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
>>> ---
>>>  fs/ceph/caps.c | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
>>> index 790f88b..458a66e 100644
>>> --- a/fs/ceph/caps.c
>>> +++ b/fs/ceph/caps.c
>>> @@ -1982,8 +1982,14 @@ static void kick_flushing_inode_caps(struct ceph_mds_client *mdsc,
>>>  	cap = ci->i_auth_cap;
>>>  	dout("kick_flushing_inode_caps %p flushing %s flush_seq %lld\n", inode,
>>>  	     ceph_cap_string(ci->i_flushing_caps), ci->i_cap_flush_seq);
>>> +
>>>  	__ceph_flush_snaps(ci, &session, 1);
>>
>> This function does funny things to the local session pointer... did you
>> consider this when using it below?  It can change to the auth cap mds if
>> it is different than the value passed in...
>

I didn't realize that. But even taking it into consideration, I still don't
understand how the list gets corrupted. Did you use snapshots? How many
active MDSes?

> I wonder if we screwed something up here, but I just got a crash inside
> remove_session_caps() that might be explained by a corrupt list.  I don't
> think I've seen this before..

Was it BUG_ON(session->s_nr_caps > 0) or
BUG_ON(!list_empty(&session->s_cap_flushing))?  And why did the kclient
receive a CEPH_SESSION_CLOSE message?

Regards
Yan, Zheng

>
> 0xffff880214aabf20   753   2   1   3   R   0xffff880214aac3a8  *kworker/3:2
>  ffff880224a33ae8 0000000000000018 ffffffffa0814d63 ffff880224f85800
>  ffff88020b277790 ffff880224f85800 ffff88020c04e800 ffff880224a33c08
>  ffffffffa081a1cf ffffffffffffffff ffff880224a33fd8 ffffffffffffffff
> Call Trace:
>  [<ffffffffa0814d63>] ? remove_session_caps+0x33/0x140 [ceph]
>  [<ffffffffa081a1cf>] ? dispatch+0x7ff/0x1740 [ceph]
>  [<ffffffff81510b66>] ? kernel_recvmsg+0x46/0x60
>  [<ffffffffa07c4e38>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
>  [<ffffffff810a317d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffffa07c81f8>] ? con_work+0x1948/0x2d50 [libceph]
>  [<ffffffff81080c93>] ? idle_balance+0x133/0x180
>  [<ffffffff81071c58>] ? finish_task_switch+0x48/0x110
>  [<ffffffff81071c58>] ? finish_task_switch+0x48/0x110
>  [<ffffffff8105f44f>] ? process_one_work+0x16f/0x540
>  [<ffffffff8105f4ba>] ? process_one_work+0x1da/0x540
>  [<ffffffff8105f44f>] ? process_one_work+0x16f/0x540
>  [<ffffffff8106069c>] ? worker_thread+0x11c/0x370
>  [<ffffffff81060580>] ? manage_workers.isra.20+0x2e0/0x2e0
>  [<ffffffff8106735a>] ? kthread+0xea/0xf0
>
>>
>>> +
>>>  	if (ci->i_flushing_caps) {
>>> +		spin_lock(&mdsc->cap_dirty_lock);
>>> +		list_move_tail(&ci->i_flushing_item, &session->s_cap_flushing);
>>> +		spin_unlock(&mdsc->cap_dirty_lock);
>>> +
>>>  		delayed = __send_cap(mdsc, cap, CEPH_CAP_OP_FLUSH,
>>>  				     __ceph_caps_used(ci),
>>>  				     __ceph_caps_wanted(ci),
>>> --
>>> 1.8.1.4
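[One way the re-filing could go wrong with a swapped session pointer is sketched below as a hypothetical defensive variant of the hunk under discussion. This is an illustration only, not the fix that was eventually merged: move i_flushing_item only when the possibly swapped "session" still matches the session owning the auth cap, so the inode cannot be parked on one session's s_cap_flushing list while teardown of another session (e.g. on CEPH_SESSION_CLOSE) still expects to find it there.]

```c
	__ceph_flush_snaps(ci, &session, 1);

	if (ci->i_flushing_caps) {
		spin_lock(&mdsc->cap_dirty_lock);
		/* "session" may have been swapped by __ceph_flush_snaps() */
		if (cap->session == session)
			list_move_tail(&ci->i_flushing_item,
				       &session->s_cap_flushing);
		spin_unlock(&mdsc->cap_dirty_lock);

		/* remainder of the branch (__send_cap, ...) unchanged */
	}
```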
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 790f88b..458a66e 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1982,8 +1982,14 @@ static void kick_flushing_inode_caps(struct ceph_mds_client *mdsc,
 	cap = ci->i_auth_cap;
 	dout("kick_flushing_inode_caps %p flushing %s flush_seq %lld\n", inode,
 	     ceph_cap_string(ci->i_flushing_caps), ci->i_cap_flush_seq);
+
 	__ceph_flush_snaps(ci, &session, 1);
+
 	if (ci->i_flushing_caps) {
+		spin_lock(&mdsc->cap_dirty_lock);
+		list_move_tail(&ci->i_flushing_item, &session->s_cap_flushing);
+		spin_unlock(&mdsc->cap_dirty_lock);
+
 		delayed = __send_cap(mdsc, cap, CEPH_CAP_OP_FLUSH,
 				     __ceph_caps_used(ci),
 				     __ceph_caps_wanted(ci),