| Message ID | 1527673416-31268-5-git-send-email-lidongchen@tencent.com (mailing list archive) |
|---|---|
| State | New, archived |
* Lidong Chen (jemmy858585@gmail.com) wrote:
> From: Lidong Chen <jemmy858585@gmail.com>
>
> channel_close may be invoked by different threads. For example, the
> source qemu invokes qemu_fclose in the main thread, migration thread
> and return path thread; the destination qemu invokes qemu_fclose in
> the main thread, listen thread and COLO incoming thread.
>
> Add a mutex to the QEMUFile struct to avoid concurrent invocations of
> channel_close.
>
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> ---
>  migration/qemu-file.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index 977b9ae..87d0f05 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -52,6 +52,7 @@ struct QEMUFile {
>      unsigned int iovcnt;
>
>      int last_error;
> +    QemuMutex lock;

That could do with a comment saying what you're protecting.

> };
>
> /*
> @@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>
>      f = g_new0(QEMUFile, 1);
>
> +    qemu_mutex_init(&f->lock);
>      f->opaque = opaque;
>      f->ops = ops;
>      return f;
> @@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
>      ret = qemu_file_get_error(f);
>
>      if (f->ops->close) {
> +        qemu_mutex_lock(&f->lock);
>          int ret2 = f->ops->close(f->opaque);
> +        qemu_mutex_unlock(&f->lock);

OK, and at least for the RDMA code, if it calls close a 2nd time,
rioc->rdma is checked so it won't actually free stuff a 2nd time.

>          if (ret >= 0) {
>              ret = ret2;
>          }
> @@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
>      if (f->last_error) {
>          ret = f->last_error;
>      }
> +    qemu_mutex_destroy(&f->lock);
>      g_free(f);

Hmm, but that's not safe; if two things really do call qemu_fclose() on
the same structure they race here and can end up destroying the lock
twice, or touching f->lock after the 1st one has already done g_free(f).

So let's go back a step. I think:

 a) There should always be a separate QEMUFile* for to_src_file and
    from_src_file - I don't see where you open the 2nd one; I don't see
    your implementation of f->ops->get_return_path.

 b) I *think* that while the different threads might all call fclose(),
    there should only ever be one qemu_fclose call for each direction on
    the QEMUFile.

But now we have two problems:
 If (a) is true then f->lock is separate on each one, so it doesn't
 really protect if the two directions are closed at once (assuming (b)
 is true).

 If (a) is false and we actually share a single QEMUFile then that race
 at the end happens.

Dave

>      trace_qemu_file_fclose();
>      return ret;
> --
> 1.8.3.1
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
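To make the race concrete, here is a minimal self-contained illustration in plain C with pthreads (not QEMU code; the `File` type and every name in it are invented) showing why a lock embedded in the structure it guards cannot serialize that structure's own teardown:

```c
#include <pthread.h>
#include <stdlib.h>

/* Invented stand-in for QEMUFile: the lock lives inside the very
 * object whose destruction it is supposed to serialize. */
typedef struct {
    pthread_mutex_t lock;
    int fd;                          /* stand-in for f->opaque */
} File;

static void file_close(File *f)
{
    pthread_mutex_lock(&f->lock);    /* thread B may block here...      */
    /* ... close f->fd under the lock ... */
    pthread_mutex_unlock(&f->lock);
    pthread_mutex_destroy(&f->lock); /* ...while thread A destroys the  */
    free(f);                         /* lock and frees f; B then wakes  */
}                                    /* up inside freed memory.         */

static void *closer(void *opaque)
{
    file_close(opaque);              /* both threads race into here */
    return NULL;
}

int main(void)
{
    File *f = malloc(sizeof(*f));
    pthread_mutex_init(&f->lock, NULL);
    f->fd = -1;

    pthread_t a, b;
    pthread_create(&a, NULL, closer, f);
    pthread_create(&b, NULL, closer, f); /* 2nd close: undefined behaviour */
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
```

Whichever thread loses the race either destroys an already-destroyed mutex or locks freed memory, which is exactly the qemu_mutex_destroy/g_free window Dave identifies above.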
On Wed, May 30, 2018 at 10:45 PM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Lidong Chen (jemmy858585@gmail.com) wrote:
>> From: Lidong Chen <jemmy858585@gmail.com>
>>
>> channel_close may be invoked by different threads. For example, the
>> source qemu invokes qemu_fclose in the main thread, migration thread
>> and return path thread; the destination qemu invokes qemu_fclose in
>> the main thread, listen thread and COLO incoming thread.
>>
>> Add a mutex to the QEMUFile struct to avoid concurrent invocations of
>> channel_close.
>>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> ---
>>  migration/qemu-file.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> index 977b9ae..87d0f05 100644
>> --- a/migration/qemu-file.c
>> +++ b/migration/qemu-file.c
>> @@ -52,6 +52,7 @@ struct QEMUFile {
>>      unsigned int iovcnt;
>>
>>      int last_error;
>> +    QemuMutex lock;
>
> That could do with a comment saying what you're protecting.
>
>> };
>>
>> /*
>> @@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>>
>>      f = g_new0(QEMUFile, 1);
>>
>> +    qemu_mutex_init(&f->lock);
>>      f->opaque = opaque;
>>      f->ops = ops;
>>      return f;
>> @@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
>>      ret = qemu_file_get_error(f);
>>
>>      if (f->ops->close) {
>> +        qemu_mutex_lock(&f->lock);
>>          int ret2 = f->ops->close(f->opaque);
>> +        qemu_mutex_unlock(&f->lock);
>
> OK, and at least for the RDMA code, if it calls close a 2nd time,
> rioc->rdma is checked so it won't actually free stuff a 2nd time.
>
>>          if (ret >= 0) {
>>              ret = ret2;
>>          }
>> @@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
>>      if (f->last_error) {
>>          ret = f->last_error;
>>      }
>> +    qemu_mutex_destroy(&f->lock);
>>      g_free(f);
>
> Hmm, but that's not safe; if two things really do call qemu_fclose() on
> the same structure they race here and can end up destroying the lock
> twice, or touching f->lock after the 1st one has already done g_free(f).
>
> So let's go back a step. I think:
>  a) There should always be a separate QEMUFile* for to_src_file and
>     from_src_file - I don't see where you open the 2nd one; I don't see
>     your implementation of f->ops->get_return_path.

Yes, the current QEMU version uses a separate QEMUFile* for to_src_file
and from_src_file, and the two QEMUFiles point to one QIOChannelRDMA.

f->ops->get_return_path is implemented by channel_output_ops or
channel_input_ops.

>  b) I *think* that while the different threads might all call fclose(),
>     there should only ever be one qemu_fclose call for each direction on
>     the QEMUFile.
>
> But now we have two problems:
>  If (a) is true then f->lock is separate on each one, so it doesn't
>  really protect if the two directions are closed at once (assuming (b)
>  is true).

Yes, you are right. So I should add a QemuMutex to the QIOChannel
structure, not the QEMUFile structure, and qemu_mutex_destroy the
QemuMutex in qio_channel_finalize.

Thank you.

>
>  If (a) is false and we actually share a single QEMUFile then that race
>  at the end happens.
>
> Dave
>
>
>>      trace_qemu_file_fclose();
>>      return ret;
>> --
>> 1.8.3.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* 858585 jemmy (jemmy858585@gmail.com) wrote:
> On Wed, May 30, 2018 at 10:45 PM, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> > * Lidong Chen (jemmy858585@gmail.com) wrote:
> >> From: Lidong Chen <jemmy858585@gmail.com>
> >>
> >> channel_close may be invoked by different threads. For example, the
> >> source qemu invokes qemu_fclose in the main thread, migration thread
> >> and return path thread; the destination qemu invokes qemu_fclose in
> >> the main thread, listen thread and COLO incoming thread.
> >>
> >> Add a mutex to the QEMUFile struct to avoid concurrent invocations of
> >> channel_close.
> >>
> >> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> >> ---
> >>  migration/qemu-file.c | 5 +++++
> >>  1 file changed, 5 insertions(+)
> >>
> >> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> >> index 977b9ae..87d0f05 100644
> >> --- a/migration/qemu-file.c
> >> +++ b/migration/qemu-file.c
> >> @@ -52,6 +52,7 @@ struct QEMUFile {
> >>      unsigned int iovcnt;
> >>
> >>      int last_error;
> >> +    QemuMutex lock;
> >
> > That could do with a comment saying what you're protecting.
> >
> >> };
> >>
> >> /*
> >> @@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
> >>
> >>      f = g_new0(QEMUFile, 1);
> >>
> >> +    qemu_mutex_init(&f->lock);
> >>      f->opaque = opaque;
> >>      f->ops = ops;
> >>      return f;
> >> @@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
> >>      ret = qemu_file_get_error(f);
> >>
> >>      if (f->ops->close) {
> >> +        qemu_mutex_lock(&f->lock);
> >>          int ret2 = f->ops->close(f->opaque);
> >> +        qemu_mutex_unlock(&f->lock);
> >
> > OK, and at least for the RDMA code, if it calls close a 2nd time,
> > rioc->rdma is checked so it won't actually free stuff a 2nd time.
> >
> >>          if (ret >= 0) {
> >>              ret = ret2;
> >>          }
> >> @@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
> >>      if (f->last_error) {
> >>          ret = f->last_error;
> >>      }
> >> +    qemu_mutex_destroy(&f->lock);
> >>      g_free(f);
> >
> > Hmm, but that's not safe; if two things really do call qemu_fclose() on
> > the same structure they race here and can end up destroying the lock
> > twice, or touching f->lock after the 1st one has already done g_free(f).
> >
> > So let's go back a step. I think:
> >  a) There should always be a separate QEMUFile* for to_src_file and
> >     from_src_file - I don't see where you open the 2nd one; I don't see
> >     your implementation of f->ops->get_return_path.
>
> Yes, the current QEMU version uses a separate QEMUFile* for to_src_file
> and from_src_file, and the two QEMUFiles point to one QIOChannelRDMA.
>
> f->ops->get_return_path is implemented by channel_output_ops or
> channel_input_ops.

Ah OK, yes that makes sense.

> >  b) I *think* that while the different threads might all call fclose(),
> >     there should only ever be one qemu_fclose call for each direction on
> >     the QEMUFile.
> >
> > But now we have two problems:
> >  If (a) is true then f->lock is separate on each one, so it doesn't
> >  really protect if the two directions are closed at once (assuming (b)
> >  is true).
>
> Yes, you are right. So I should add a QemuMutex to the QIOChannel
> structure, not the QEMUFile structure, and qemu_mutex_destroy the
> QemuMutex in qio_channel_finalize.

OK, that sounds better.

Dave

> Thank you.
>
> >
> >  If (a) is false and we actually share a single QEMUFile then that race
> >  at the end happens.
> >
> > Dave
> >
> >
> >>      trace_qemu_file_fclose();
> >>      return ret;
> >> --
> >> 1.8.3.1
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
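For reference, a sketch of the direction agreed above: move the lock into QIOChannel, whose QOM lifecycle guarantees finalize runs exactly once. This is an assumption-laden illustration, not the patch that was eventually merged; the field placement, the instance_init hook, and the function bodies are guesses against io/channel.{h,c}, and only the type names (QIOChannel, QIOChannelClass, qio_channel_close) come from the real API.

```c
/* HYPOTHETICAL SKETCH ONLY -- not the merged fix. */

/* include/io/channel.h: add a lock to the channel object */
struct QIOChannel {
    Object parent;
    /* ... existing fields elided ... */
    QemuMutex lock;   /* serializes io_close() across threads */
};

/* io/channel.c */
static void qio_channel_init(Object *obj)
{
    QIOChannel *ioc = QIO_CHANNEL(obj);
    qemu_mutex_init(&ioc->lock);
}

int qio_channel_close(QIOChannel *ioc, Error **errp)
{
    QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
    int ret;

    /* Both QEMUFiles built on the same QIOChannelRDMA contend on one
     * lock, so concurrent closes of the two directions serialize. */
    qemu_mutex_lock(&ioc->lock);
    ret = klass->io_close(ioc, errp);
    qemu_mutex_unlock(&ioc->lock);
    return ret;
}

static void qio_channel_finalize(Object *obj)
{
    QIOChannel *ioc = QIO_CHANNEL(obj);

    /* ... existing cleanup elided ... */
    /* finalize runs once, when the last reference is dropped, so this
     * destroy cannot race with a concurrent qemu_fclose(). */
    qemu_mutex_destroy(&ioc->lock);
}
```

Because both QEMUFile wrappers hold references to the same channel, the mutex is destroyed exactly once when the last reference goes away; that is what fixes the destroy/free race that a per-QEMUFile lock cannot.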
On Thu, May 31, 2018 at 6:52 PM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * 858585 jemmy (jemmy858585@gmail.com) wrote:
>> On Wed, May 30, 2018 at 10:45 PM, Dr. David Alan Gilbert
>> <dgilbert@redhat.com> wrote:
>> > * Lidong Chen (jemmy858585@gmail.com) wrote:
>> >> From: Lidong Chen <jemmy858585@gmail.com>
>> >>
>> >> channel_close may be invoked by different threads. For example, the
>> >> source qemu invokes qemu_fclose in the main thread, migration thread
>> >> and return path thread; the destination qemu invokes qemu_fclose in
>> >> the main thread, listen thread and COLO incoming thread.
>> >>
>> >> Add a mutex to the QEMUFile struct to avoid concurrent invocations of
>> >> channel_close.
>> >>
>> >> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> >> ---
>> >>  migration/qemu-file.c | 5 +++++
>> >>  1 file changed, 5 insertions(+)
>> >>
>> >> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> >> index 977b9ae..87d0f05 100644
>> >> --- a/migration/qemu-file.c
>> >> +++ b/migration/qemu-file.c
>> >> @@ -52,6 +52,7 @@ struct QEMUFile {
>> >>      unsigned int iovcnt;
>> >>
>> >>      int last_error;
>> >> +    QemuMutex lock;
>> >
>> > That could do with a comment saying what you're protecting.
>> >
>> >> };
>> >>
>> >> /*
>> >> @@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>> >>
>> >>      f = g_new0(QEMUFile, 1);
>> >>
>> >> +    qemu_mutex_init(&f->lock);
>> >>      f->opaque = opaque;
>> >>      f->ops = ops;
>> >>      return f;
>> >> @@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
>> >>      ret = qemu_file_get_error(f);
>> >>
>> >>      if (f->ops->close) {
>> >> +        qemu_mutex_lock(&f->lock);
>> >>          int ret2 = f->ops->close(f->opaque);
>> >> +        qemu_mutex_unlock(&f->lock);
>> >
>> > OK, and at least for the RDMA code, if it calls close a 2nd time,
>> > rioc->rdma is checked so it won't actually free stuff a 2nd time.
>> >
>> >>          if (ret >= 0) {
>> >>              ret = ret2;
>> >>          }
>> >> @@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
>> >>      if (f->last_error) {
>> >>          ret = f->last_error;
>> >>      }
>> >> +    qemu_mutex_destroy(&f->lock);
>> >>      g_free(f);
>> >
>> > Hmm, but that's not safe; if two things really do call qemu_fclose() on
>> > the same structure they race here and can end up destroying the lock
>> > twice, or touching f->lock after the 1st one has already done g_free(f).
>> >
>> > So let's go back a step. I think:
>> > a) There should always be a separate QEMUFile* for to_src_file and
>> >    from_src_file - I don't see where you open the 2nd one; I don't see
>> >    your implementation of f->ops->get_return_path.
>>
>> Yes, the current QEMU version uses a separate QEMUFile* for to_src_file
>> and from_src_file, and the two QEMUFiles point to one QIOChannelRDMA.
>>
>> f->ops->get_return_path is implemented by channel_output_ops or
>> channel_input_ops.
>
> Ah OK, yes that makes sense.
>
>> > b) I *think* that while the different threads might all call fclose(),
>> >    there should only ever be one qemu_fclose call for each direction on
>> >    the QEMUFile.
>> >
>> > But now we have two problems:
>> >  If (a) is true then f->lock is separate on each one, so it doesn't
>> >  really protect if the two directions are closed at once (assuming (b)
>> >  is true).
>>
>> Yes, you are right. So I should add a QemuMutex to the QIOChannel
>> structure, not the QEMUFile structure, and qemu_mutex_destroy the
>> QemuMutex in qio_channel_finalize.
>
> OK, that sounds better.
>
> Dave
>

Hi Dave:

Another way is to protect channel_close in the migration code, with
something like the QemuMutex rp_mutex. As Daniel mentioned, QIOChannel
implementations are only intended to be used from a single thread:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg530100.html

Which way is better? Is there a plan for QIOChannel to support multiple
threads? It is not only channel_close that needs locking between
different threads; writev_buffer also needs it.

thanks.

>> Thank you.
>>
>> >
>> >  If (a) is false and we actually share a single QEMUFile then that
>> >  race at the end happens.
>> >
>> > Dave
>> >
>> >
>> >>      trace_qemu_file_fclose();
>> >>      return ret;
>> >> --
>> >> 1.8.3.1
>> >>
>> > --
>> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
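For comparison, a sketch of the migration-side alternative floated here. `rp_mutex` is an existing MigrationState field, but the helper below and the pointer-swap idiom are hypothetical, for illustration only (it assumes the QEMU migration headers and is not compilable standalone):

```c
/* HYPOTHETICAL SKETCH: protect the close in the migration layer
 * instead of in QIOChannel. Take rp_mutex only to atomically claim
 * the QEMUFile pointer, so exactly one thread ever calls
 * qemu_fclose() on it. */
static int migration_fclose_once(MigrationState *ms, QEMUFile **fp)
{
    QEMUFile *f;
    int ret = 0;

    qemu_mutex_lock(&ms->rp_mutex);
    f = *fp;
    *fp = NULL;                 /* later callers see NULL and do nothing */
    qemu_mutex_unlock(&ms->rp_mutex);

    if (f) {
        ret = qemu_fclose(f);   /* only the thread that won the swap closes */
    }
    return ret;
}
```

Holding the lock only around the pointer swap keeps the possibly slow channel close out of the critical section, and turns "protect close with a lock" into "guarantee close happens once", which also sidesteps the destroy-after-free problem inside QEMUFile.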
On Sun, Jun 3, 2018 at 9:50 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
> On Thu, May 31, 2018 at 6:52 PM, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
>> * 858585 jemmy (jemmy858585@gmail.com) wrote:
>>> On Wed, May 30, 2018 at 10:45 PM, Dr. David Alan Gilbert
>>> <dgilbert@redhat.com> wrote:
>>> > * Lidong Chen (jemmy858585@gmail.com) wrote:
>>> >> From: Lidong Chen <jemmy858585@gmail.com>
>>> >>
>>> >> channel_close may be invoked by different threads. For example, the
>>> >> source qemu invokes qemu_fclose in the main thread, migration thread
>>> >> and return path thread; the destination qemu invokes qemu_fclose in
>>> >> the main thread, listen thread and COLO incoming thread.
>>> >>
>>> >> Add a mutex to the QEMUFile struct to avoid concurrent invocations of
>>> >> channel_close.
>>> >>
>>> >> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>>> >> ---
>>> >>  migration/qemu-file.c | 5 +++++
>>> >>  1 file changed, 5 insertions(+)
>>> >>
>>> >> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>>> >> index 977b9ae..87d0f05 100644
>>> >> --- a/migration/qemu-file.c
>>> >> +++ b/migration/qemu-file.c
>>> >> @@ -52,6 +52,7 @@ struct QEMUFile {
>>> >>      unsigned int iovcnt;
>>> >>
>>> >>      int last_error;
>>> >> +    QemuMutex lock;
>>> >
>>> > That could do with a comment saying what you're protecting.
>>> >
>>> >> };
>>> >>
>>> >> /*
>>> >> @@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>>> >>
>>> >>      f = g_new0(QEMUFile, 1);
>>> >>
>>> >> +    qemu_mutex_init(&f->lock);
>>> >>      f->opaque = opaque;
>>> >>      f->ops = ops;
>>> >>      return f;
>>> >> @@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
>>> >>      ret = qemu_file_get_error(f);
>>> >>
>>> >>      if (f->ops->close) {
>>> >> +        qemu_mutex_lock(&f->lock);
>>> >>          int ret2 = f->ops->close(f->opaque);
>>> >> +        qemu_mutex_unlock(&f->lock);
>>> >
>>> > OK, and at least for the RDMA code, if it calls close a 2nd time,
>>> > rioc->rdma is checked so it won't actually free stuff a 2nd time.
>>> >
>>> >>          if (ret >= 0) {
>>> >>              ret = ret2;
>>> >>          }
>>> >> @@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
>>> >>      if (f->last_error) {
>>> >>          ret = f->last_error;
>>> >>      }
>>> >> +    qemu_mutex_destroy(&f->lock);
>>> >>      g_free(f);
>>> >
>>> > Hmm, but that's not safe; if two things really do call qemu_fclose() on
>>> > the same structure they race here and can end up destroying the lock
>>> > twice, or touching f->lock after the 1st one has already done g_free(f).
>>> >
>>> > So let's go back a step. I think:
>>> > a) There should always be a separate QEMUFile* for to_src_file and
>>> >    from_src_file - I don't see where you open the 2nd one; I don't see
>>> >    your implementation of f->ops->get_return_path.
>>>
>>> Yes, the current QEMU version uses a separate QEMUFile* for to_src_file
>>> and from_src_file, and the two QEMUFiles point to one QIOChannelRDMA.
>>>
>>> f->ops->get_return_path is implemented by channel_output_ops or
>>> channel_input_ops.
>>
>> Ah OK, yes that makes sense.
>>
>>> > b) I *think* that while the different threads might all call fclose(),
>>> >    there should only ever be one qemu_fclose call for each direction on
>>> >    the QEMUFile.
>>> >
>>> > But now we have two problems:
>>> >  If (a) is true then f->lock is separate on each one, so it doesn't
>>> >  really protect if the two directions are closed at once (assuming (b)
>>> >  is true).
>>>
>>> Yes, you are right. So I should add a QemuMutex to the QIOChannel
>>> structure, not the QEMUFile structure, and qemu_mutex_destroy the
>>> QemuMutex in qio_channel_finalize.
>>
>> OK, that sounds better.
>>
>> Dave
>>
>
> Hi Dave:
>
> Another way is to protect channel_close in the migration code, with
> something like the QemuMutex rp_mutex. As Daniel mentioned, QIOChannel
> implementations are only intended to be used from a single thread:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg530100.html
>
> Which way is better? Is there a plan for QIOChannel to support multiple
> threads? It is not only channel_close that needs locking between
> different threads; writev_buffer also needs it.
>
> thanks.
>

I find that QEMU does not call qemu_mutex_destroy to release rp_mutex in
migration_instance_finalize :(

Although that qemu_mutex_destroy is not strictly necessary, it is good
practice, so it's better we fix it.

>>> Thank you.
>>>
>>> >
>>> >  If (a) is false and we actually share a single QEMUFile then that
>>> >  race at the end happens.
>>> >
>>> > Dave
>>> >
>>> >
>>> >>      trace_qemu_file_fclose();
>>> >>      return ret;
>>> >> --
>>> >> 1.8.3.1
>>> >>
>>> > --
>>> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>> --
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
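The omission would be fixed by something along these lines (sketch only; the surrounding cleanup in migration_instance_finalize is elided and varies by tree):

```c
/* migration/migration.c -- sketch of the missing destroy */
static void migration_instance_finalize(Object *obj)
{
    MigrationState *ms = MIGRATION_OBJ(obj);

    /* ... existing per-instance cleanup elided ... */
    qemu_mutex_destroy(&ms->rp_mutex);  /* pairs with the init in instance_init */
}
```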
```diff
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 977b9ae..87d0f05 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -52,6 +52,7 @@ struct QEMUFile {
     unsigned int iovcnt;
 
     int last_error;
+    QemuMutex lock;
 };
 
 /*
@@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
 
     f = g_new0(QEMUFile, 1);
 
+    qemu_mutex_init(&f->lock);
     f->opaque = opaque;
     f->ops = ops;
     return f;
@@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
     ret = qemu_file_get_error(f);
 
     if (f->ops->close) {
+        qemu_mutex_lock(&f->lock);
         int ret2 = f->ops->close(f->opaque);
+        qemu_mutex_unlock(&f->lock);
         if (ret >= 0) {
             ret = ret2;
         }
@@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
     if (f->last_error) {
         ret = f->last_error;
     }
+    qemu_mutex_destroy(&f->lock);
     g_free(f);
     trace_qemu_file_fclose();
     return ret;
```