Message ID: 1527673416-31268-12-git-send-email-lidongchen@tencent.com (mailing list archive)
State: New, archived
* Lidong Chen (jemmy858585@gmail.com) wrote:
> If the peer qemu is crashed, the qemu_rdma_wait_comp_channel function
> maybe loop forever. so we should also poll the cm event fd, and when
> receive any cm event, we consider some error happened.
>
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>

I don't understand enough about the way the infiniband fds work to
fully review this, so I'd appreciate it if someone who does could
comment/add their review.

> ---
>  migration/rdma.c | 35 ++++++++++++++++++++++++-----------
>  1 file changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 1b9e261..d611a06 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -1489,6 +1489,9 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
>   */
>  static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
>  {
> +    struct rdma_cm_event *cm_event;
> +    int ret = -1;
> +
>      /*
>       * Coroutine doesn't start until migration_fd_process_incoming()
>       * so don't yield unless we know we're running inside of a coroutine.
> @@ -1504,25 +1507,35 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
>       * But we need to be able to handle 'cancel' or an error
>       * without hanging forever.
>       */
> -    while (!rdma->error_state && !rdma->received_error) {
> -        GPollFD pfds[1];
> +    while (!rdma->error_state && !rdma->received_error) {
> +        GPollFD pfds[2];
>          pfds[0].fd = rdma->comp_channel->fd;
>          pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> +        pfds[0].revents = 0;
> +
> +        pfds[1].fd = rdma->channel->fd;
> +        pfds[1].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> +        pfds[1].revents = 0;
> +
>          /* 0.1s timeout, should be fine for a 'cancel' */
> -        switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
> -        case 1: /* fd active */
> -            return 0;
> +        qemu_poll_ns(pfds, 2, 100 * 1000 * 1000);

Shouldn't we still check the return value of this? If it's negative,
something has gone wrong.

Dave

> -        case 0: /* Timeout, go around again */
> -            break;
> +        if (pfds[1].revents) {
> +            ret = rdma_get_cm_event(rdma->channel, &cm_event);
> +            if (!ret) {
> +                rdma_ack_cm_event(cm_event);
> +            }
> +            error_report("receive cm event while wait comp channel,"
> +                         "cm event is %d", cm_event->event);
>
> -        default: /* Error of some type -
> -                  * I don't trust errno from qemu_poll_ns
> -                  */
> -            error_report("%s: poll failed", __func__);
> +            /* consider any rdma communication event as an error */
>              return -EPIPE;
>          }
>
> +        if (pfds[0].revents) {
> +            return 0;
> +        }
> +
>          if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
>              /* Bail out and let the cancellation happen */
>              return -EPIPE;
> --
> 1.8.3.1
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
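Dave's point about the dropped return-value check can be sketched with plain POSIX poll(2), used here as a stand-in for qemu_poll_ns; the original switch statement relies on the same convention (negative means error, 0 means timeout, positive means an fd is ready). `wait_readable` is a hypothetical helper for illustration, not QEMU code:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* Returns -1 on poll error, 0 on timeout, 1 if fd became readable.
 * The point is that the error case is handled, not silently dropped. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ret = poll(&pfd, 1, timeout_ms);
    if (ret < 0) {
        /* The error path the patch removes: report it rather than
         * continuing as if the fds had simply been idle. */
        fprintf(stderr, "poll failed: errno=%d\n", errno);
        return -1;
    }
    if (ret == 0) {
        return 0;   /* timeout: caller goes around the loop again */
    }
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```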
On Thu, May 31, 2018 at 1:33 AM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Lidong Chen (jemmy858585@gmail.com) wrote:
>> If the peer qemu is crashed, the qemu_rdma_wait_comp_channel function
>> maybe loop forever. so we should also poll the cm event fd, and when
>> receive any cm event, we consider some error happened.
>>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>
> I don't understand enough about the way the infiniband fd's work to
> fully review this; so I'd appreciate if some one who does could
> comment/add their review.

Hi Aviad:
    We need your help. I also could not find any documentation about
the cq channel event fd and the cm channel event fd.
    Should we set the events to G_IO_IN | G_IO_HUP | G_IO_ERR, or is
G_IO_IN enough?
        pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
        pfds[1].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
Thanks.

[...]

>> +        qemu_poll_ns(pfds, 2, 100 * 1000 * 1000);
>
> Shouldn't we still check the return value of this; if it's negative
> something has gone wrong.

I will fix this.
Thanks.

> Dave

[...]

> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
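One data point on the question above: POSIX poll(2) reports POLLHUP and POLLERR in revents even when they are not requested in events, so listing G_IO_HUP | G_IO_ERR should be harmless but, at least for poll semantics, not strictly required. A small self-contained check, with a plain pipe standing in for the verbs fds (`poll_revents_pollin_only` is a hypothetical helper for this demonstration):

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Returns the revents poll reports for fd when only POLLIN is
 * requested; a zero timeout makes this a non-blocking sample. */
short poll_revents_pollin_only(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    poll(&pfd, 1, 0);
    return pfd.revents;
}
```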
+Gal

Gal, please comment with our findings.

Thanks!

On 5/31/2018 10:36 AM, 858585 jemmy wrote:
> Hi Aviad:
>     We need your help. I also could not find any documentation about
> the cq channel event fd and the cm channel event fd.
>     Should we set the events to G_IO_IN | G_IO_HUP | G_IO_ERR, or is
> G_IO_IN enough?
>         pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
>         pfds[1].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> Thanks.
[...]
On Sun, Jun 3, 2018 at 11:04 PM, Aviad Yehezkel
<aviadye@dev.mellanox.co.il> wrote:
> +Gal
>
> Gal, please comment with our findings.

Some suggestions from Gal:

1. Regarding the GIOConditions for the GPollFD.events/revents:
   G_IO_IN is enough for cm_channel; if it is not empty, it will
   return POLLIN | POLLRDNORM, but you can also use G_IO_HUP |
   G_IO_ERR just to be safe.

2. Please note that the patch does not currently check the error
   return values of qemu_poll_ns, rdma_get_cm_event and
   rdma_ack_cm_event.

3. You should consider checking for specific RDMA_CM_EVENT types,
   such as RDMA_CM_EVENT_DISCONNECTED and RDMA_CM_EVENT_DEVICE_REMOVAL.
   For example, receiving RDMA_CM_EVENT_ADDR_CHANGE should not result
   in an error.

4. It is better to first poll the CQ, and only if it has no new CQEs
   poll the event queue. This way, you will not go into the error path
   even if you've got a CQE.

I will send a new version of the patch.

> Thanks!

[...]
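Suggestion 3 above amounts to classifying CM events instead of treating every event as fatal. A minimal sketch of that idea follows; the enum is a stand-in mirroring a few rdma_cm_event_type names from <rdma/rdma_cma.h> so the example builds without librdmacm, and `cm_event_is_fatal` is a hypothetical helper, not QEMU code:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for rdma_cm_event_type; the real code would use the
 * librdmacm enum directly. */
enum cm_event_type {
    CM_EVENT_ADDR_CHANGE,
    CM_EVENT_DISCONNECTED,
    CM_EVENT_DEVICE_REMOVAL,
};

/* Only events that actually break the connection count as errors;
 * benign notifications let the wait loop continue. */
bool cm_event_is_fatal(enum cm_event_type ev)
{
    switch (ev) {
    case CM_EVENT_DISCONNECTED:
    case CM_EVENT_DEVICE_REMOVAL:
        return true;
    case CM_EVENT_ADDR_CHANGE:   /* e.g. an IP change: not fatal */
    default:
        return false;
    }
}
```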
diff --git a/migration/rdma.c b/migration/rdma.c
index 1b9e261..d611a06 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1489,6 +1489,9 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
  */
 static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
 {
+    struct rdma_cm_event *cm_event;
+    int ret = -1;
+
     /*
      * Coroutine doesn't start until migration_fd_process_incoming()
      * so don't yield unless we know we're running inside of a coroutine.
@@ -1504,25 +1507,35 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
      * But we need to be able to handle 'cancel' or an error
      * without hanging forever.
      */
-    while (!rdma->error_state && !rdma->received_error) {
-        GPollFD pfds[1];
+    while (!rdma->error_state && !rdma->received_error) {
+        GPollFD pfds[2];
         pfds[0].fd = rdma->comp_channel->fd;
         pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+        pfds[0].revents = 0;
+
+        pfds[1].fd = rdma->channel->fd;
+        pfds[1].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+        pfds[1].revents = 0;
+
         /* 0.1s timeout, should be fine for a 'cancel' */
-        switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
-        case 1: /* fd active */
-            return 0;
+        qemu_poll_ns(pfds, 2, 100 * 1000 * 1000);

-        case 0: /* Timeout, go around again */
-            break;
+        if (pfds[1].revents) {
+            ret = rdma_get_cm_event(rdma->channel, &cm_event);
+            if (!ret) {
+                rdma_ack_cm_event(cm_event);
+            }
+            error_report("receive cm event while wait comp channel,"
+                         "cm event is %d", cm_event->event);

-        default: /* Error of some type -
-                  * I don't trust errno from qemu_poll_ns
-                  */
-            error_report("%s: poll failed", __func__);
+            /* consider any rdma communication event as an error */
             return -EPIPE;
         }

+        if (pfds[0].revents) {
+            return 0;
+        }
+
         if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
             /* Bail out and let the cancellation happen */
             return -EPIPE;
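Pulling the review feedback together (check poll's return value, and service the completion fd before the CM fd), the revised loop shape might look like the following sketch. It uses plain POSIX poll(2) in place of qemu_poll_ns, with pipe fds standing in for comp_channel->fd and channel->fd; `wait_comp_channel` here is illustrative, not the actual QEMU function:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Returns 0 if the completion fd is ready, -EPIPE on poll error or
 * CM-channel activity, and 1 on timeout (caller loops again). */
int wait_comp_channel(int comp_fd, int cm_fd, int timeout_ms)
{
    struct pollfd pfds[2] = {
        { .fd = comp_fd, .events = POLLIN, .revents = 0 },
        { .fd = cm_fd,   .events = POLLIN, .revents = 0 },
    };

    int ret = poll(pfds, 2, timeout_ms);
    if (ret < 0) {
        return -EPIPE;      /* Dave's point: don't ignore poll errors */
    }
    if (pfds[0].revents) {
        return 0;           /* Gal's point 4: check the CQ side first */
    }
    if (pfds[1].revents) {
        return -EPIPE;      /* CM activity with no CQE: treat as error */
    }
    return 1;               /* timeout, go around again */
}
```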
If the peer qemu has crashed, the qemu_rdma_wait_comp_channel function
may loop forever. So we should also poll the cm event fd, and when we
receive any cm event, we consider that some error has happened.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
 migration/rdma.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)