Message ID | 1453465198-11000-1-git-send-email-lprosek@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Hi Ladi,

Adding Pankaj to CC, he too looked at this recently.

On (Fri) 22 Jan 2016 [13:19:58], Ladi Prosek wrote:
> If the guest adds a buffer to the virtio queue while another buffer
> is still pending and hasn't been filled and returned by the rng
> device, rng-random internally discards the pending request, which
> leads to the second buffer getting stuck in the queue. For the guest
> this manifests as delayed completion of reads from virtio-rng, i.e.
> a read is completed only after another read is issued.
>
> This patch adds an internal queue of requests, analogous to what
> rng-egd uses, to make sure that requests and responses are balanced
> and correctly ordered.

... and this can lead to breaking migration (the queue of requests on
the host needs to be migrated, else the new host will have no idea of
the queue).

I think we should limit the queue size to 1 instead. Multiple rng
requests should not be common, because if we did have entropy, we'd
just service the guest request and be done with it. If we haven't
replied to the guest, it just means that the host itself is waiting
for more entropy, or is waiting for the timeout before the guest's
ratelimit is lifted.

So, instead of fixing this using a queue, how about limiting the size
of the vq to have just one element at a time?

Thanks,

Amit
Hi Amit,

----- Original Message -----
> Hi Ladi,
>
> Adding Pankaj to CC, he too looked at this recently.
>
> On (Fri) 22 Jan 2016 [13:19:58], Ladi Prosek wrote:
> > If the guest adds a buffer to the virtio queue while another buffer
> > is still pending and hasn't been filled and returned by the rng
> > device, rng-random internally discards the pending request, which
> > leads to the second buffer getting stuck in the queue. For the guest
> > this manifests as delayed completion of reads from virtio-rng, i.e.
> > a read is completed only after another read is issued.
> >
> > This patch adds an internal queue of requests, analogous to what
> > rng-egd uses, to make sure that requests and responses are balanced
> > and correctly ordered.
>
> ... and this can lead to breaking migration (the queue of requests on
> the host needs to be migrated, else the new host will have no idea of
> the queue).

I was under the impression that clearing the queue pre-migration, as
implemented by the RngBackendClass::cancel_requests callback, is enough.
If it wasn't, the rng-egd backend would already be broken, as its
queueing logic is pretty much identical.

/**
 * rng_backend_cancel_requests:
 * @s: the backend to cancel all pending requests in
 *
 * Cancels all pending requests submitted by @rng_backend_request_entropy. This
 * should be used by a device during reset or in preparation for live migration
 * to stop tracking any request.
 */
void rng_backend_cancel_requests(RngBackend *s);

Upon closer inspection though, this function appears to have no callers.
Either I'm missing something or there's another bug to be fixed.

> I think we should limit the queue size to 1 instead. Multiple rng
> requests should not be common, because if we did have entropy, we'd
> just service the guest request and be done with it. If we haven't
> replied to the guest, it just means that the host itself is waiting
> for more entropy, or is waiting for the timeout before the guest's
> ratelimit is lifted.

The scenario I had in mind is multiple processes in the guest requesting
entropy at the same time, no ratelimit, and a fast entropy source in the
host. Being able to queue up requests would definitely help boost
performance; I think I even benchmarked it but I must have lost the
numbers. I can set it up again and rerun the benchmark if you're
interested.

> So, instead of fixing this using a queue, how about limiting the size
> of the vq to have just one element at a time?

I don't believe that this is a good solution. Although perfectly valid
spec-wise, I can see how a one-element queue could confuse
less-than-perfect driver implementations. Additionally, the driver would
have to implement some kind of guest-side queueing logic and serialize
its requests, or else drop them if the virtqueue is full. Overall, I
don't think that it's completely crazy to call it a breaking change.

> Thanks,
>
> Amit
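For illustration, a caller of this function would be expected to look roughly like the sketch below, per the doc comment quoted above. The device type "MyRngDevice", its member names, and the reset handler are all invented for this example; as noted, nothing in the tree currently makes such a call.

/* Hypothetical sketch only -- not existing QEMU code. */
static void my_rng_device_reset(MyRngDevice *dev)
{
    /* Drop any entropy requests the backend is still tracking, so that a
     * stale completion cannot fire after reset or after migration. */
    rng_backend_cancel_requests(dev->rng_backend);
}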
On 03/02/2016 13:36, Amit Shah wrote:
> ... and this can lead to breaking migration (the queue of requests on
> the host needs to be migrated, else the new host will have no idea of
> the queue).

It is already migrated as part of virtio_rng_save's call to virtio_save.
On the loading side, virtio_rng_process condenses all requests into one
and chr_read fills in as many virtqueue buffers as possible from the
single request.

Cancel_requests seems useless.

Paolo
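For context, the save/load path referred to here looks roughly like the following. This is a paraphrased sketch of the 2016-era hw/virtio/virtio-rng.c, not a verbatim quote, with error handling trimmed:

static void virtio_rng_save(QEMUFile *f, void *opaque)
{
    VirtIODevice *vdev = opaque;

    /* Generic virtio state, including the rng virtqueue and any guest
     * buffers still queued in it, is saved here. */
    virtio_save(vdev, f);
}

static int virtio_rng_load(QEMUFile *f, void *opaque, int version_id)
{
    VirtIORNG *vrng = opaque;
    int ret = virtio_load(VIRTIO_DEVICE(vrng), f, version_id);

    if (ret != 0) {
        return ret;
    }

    /* On the destination, condense whatever buffers came across into a
     * single backend request and start filling them again. */
    virtio_rng_process(vrng);
    return 0;
}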
Hi Ladi,

I think this is fine if we have multiple requests from Guests and
depending on entropy pool available we can honour individual requests
and return with the entropy.

Just one point I have is: suppose we have multiple requests from Guests
and we are returning if request length < 0 for any request. There might
be pending requests which will get executed in the next iteration? Can we
honour them in the same iteration?

static void entropy_available(void *opaque)
{
    ...
    ...
    while (s->requests != NULL) {
        ...
        if (len < 0 && errno == EAGAIN) {
+            return;
+        }
        ...

Best regards,
Pankaj

>
>
>
> On 03/02/2016 13:36, Amit Shah wrote:
> > ... and this can lead to breaking migration (the queue of requests on
> > the host needs to be migrated, else the new host will have no idea of
> > the queue).
>
> It is already migrated as part of virtio_rng_save's call to virtio_save.
> On the loading side, virtio_rng_process condenses all requests into one
> and chr_read fills in as many virtqueue buffers as possible from the
> single request.
>
> Cancel_requests seems useless.
>
> Paolo
>
----- Original Message -----
>
>
> On 03/02/2016 13:36, Amit Shah wrote:
> > ... and this can lead to breaking migration (the queue of requests on
> > the host needs to be migrated, else the new host will have no idea of
> > the queue).
>
> It is already migrated as part of virtio_rng_save's call to virtio_save.
> On the loading side, virtio_rng_process condenses all requests into one
> and chr_read fills in as many virtqueue buffers as possible from the
> single request.

Thanks! So this looks broken. The contract between virtio-rng and the rng
backend is "one chr_read callback per one rng_backend_request_entropy",
regardless of the request size. It guarantees that chr_read will get at
least one byte, nothing else. If the one condensed request is bigger than
what the backend is able to give at the moment, there may be unsatisfied
virtio buffers left in the queue.

One way of fixing this would be to have chr_read call virtio_rng_process
if it ran out of backend data but not out of virtio buffers. Otherwise
those buffers will be slowly drained by the rate limit timer, which is
not optimal (especially in the absence of rate limit :)

> Cancel_requests seems useless.

I'll delete that code.

> Paolo
>
>
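A rough sketch of the fix proposed above might look like the following. This is simplified and not the actual virtio-rng code: the loop that copies the received bytes into virtqueue elements and the quota bookkeeping are elided.

/* Sketch only -- illustrates the idea, not the real implementation. */
static void chr_read(void *opaque, const void *buf, size_t size)
{
    VirtIORNG *vrng = opaque;

    /* ... copy 'buf' into as many queued virtqueue elements as it fills,
     * pushing each completed element back to the guest ... */

    if (!virtio_queue_empty(vrng->vq)) {
        /* The backend ran dry before the guest did: ask for more entropy
         * right away instead of waiting for the rate-limit timer. */
        virtio_rng_process(vrng);
    }
}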
Hi Pankaj,

----- Original Message -----
>
> Hi Ladi,
>
> I think this is fine if we have multiple requests from Guests and
> depending on entropy pool available we can honour individual requests
> and return with the entropy.
>
> Just one point I have is: suppose we have multiple requests from Guests
> and we are returning if request length < 0 for any request. There might
> be pending requests which will get executed in the next iteration? Can we
> honour them in the same iteration?
>
> static void entropy_available(void *opaque)
> {
>     ...
>     ...
>     while (s->requests != NULL) {
>         ...
>         if (len < 0 && errno == EAGAIN) {
> +            return;
> +        }
>         ...

All requests in the queue are waiting for data to be readable from the same
fd. If the fd has no data (read fails with EAGAIN), there's no point in
hammering on the fd in a loop. Returning seems to be the right thing to do.

Please let me know if I misunderstood your comment.

> Best regards,
> Pankaj
>
> >
> >
> >
> > On 03/02/2016 13:36, Amit Shah wrote:
> > > ... and this can lead to breaking migration (the queue of requests on
> > > the host needs to be migrated, else the new host will have no idea of
> > > the queue).
> >
> > It is already migrated as part of virtio_rng_save's call to virtio_save.
> > On the loading side, virtio_rng_process condenses all requests into one
> > and chr_read fills in as many virtqueue buffers as possible from the
> > single request.
> >
> > Cancel_requests seems useless.
> >
> > Paolo
> >
> >
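The fd behaviour being discussed can be seen with a small standalone program. This is not QEMU code; it reflects how /dev/random behaved on kernels of that era, where a non-blocking read fails with EAGAIN once the kernel's entropy estimate is exhausted, so retrying in a tight loop gains nothing.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    unsigned char buf[64];
    int fd = open("/dev/random", O_RDONLY | O_NONBLOCK);

    if (fd < 0) {
        perror("open");
        return 1;
    }

    for (int i = 0; i < 4; i++) {
        ssize_t len = read(fd, buf, sizeof(buf));

        if (len < 0 && errno == EAGAIN) {
            /* Same condition rng-random checks: nothing to do but wait for
             * the fd to become readable again. */
            printf("read %d: EAGAIN, pool empty for now\n", i);
        } else if (len < 0) {
            printf("read %d: error: %s\n", i, strerror(errno));
        } else {
            printf("read %d: got %zd bytes\n", i, len);
        }
    }

    close(fd);
    return 0;
}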
----- Original Message -----
> ----- Original Message -----
> >
> >
> > On 03/02/2016 13:36, Amit Shah wrote:
> > > ... and this can lead to breaking migration (the queue of requests on
> > > the host needs to be migrated, else the new host will have no idea of
> > > the queue).
> >
> > It is already migrated as part of virtio_rng_save's call to virtio_save.
> > On the loading side, virtio_rng_process condenses all requests into one
> > and chr_read fills in as many virtqueue buffers as possible from the
> > single request.
>
> Thanks! So this looks broken. The contract between virtio-rng and the rng
> backend is "one chr_read callback per one rng_backend_request_entropy",
> regardless of the request size. It guarantees that chr_read will get at
> least one byte, nothing else. If the one condensed request is bigger than
> what the backend is able to give at the moment, there may be unsatisfied
> virtio buffers left in the queue.
>
> One way of fixing this would be to have chr_read call virtio_rng_process
> if it ran out of backend data but not out of virtio buffers. Otherwise
> those buffers will be slowly drained by the rate limit timer, which is
> not optimal (especially in the absence of rate limit :)

Basically, we could just revert this commit:

commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e
Author: Amit Shah <amit.shah@redhat.com>
Date:   Wed Nov 21 11:21:19 2012 +0530

    virtio-rng: remove extra request for entropy

    If we got fewer bytes from the backend than requested, don't poke the
    backend for more bytes; the guest will ask for more (or if the guest has
    already asked for more, the backend knows about it via handle_input()

> > Cancel_requests seems useless.
>
> I'll delete that code.
>
> > Paolo
> >
> >
>
> Hi Pankaj,
>
> ----- Original Message -----
> >
> > Hi Ladi,
> >
> > I think this is fine if we have multiple requests from Guests and
> > depending on entropy pool available we can honour individual requests
> > and return with the entropy.
> >
> > Just one point I have is: suppose we have multiple requests from Guests
> > and we are returning if request length < 0 for any request. There might
> > be pending requests which will get executed in the next iteration? Can we
> > honour them in the same iteration?
> >
> > static void entropy_available(void *opaque)
> > {
> >     ...
> >     ...
> >     while (s->requests != NULL) {
> >         ...
> >         if (len < 0 && errno == EAGAIN) {
> > +            return;
> > +        }
> >         ...
>
> All requests in the queue are waiting for data to be readable from the same
> fd. If the fd has no data (read fails with EAGAIN), there's no point in
> hammering on the fd in a loop. Returning seems to be the right thing to do.
>
> Please let me know if I misunderstood your comment.

You are right.

Thanks,
Pankaj

> > Best regards,
> > Pankaj
> >
> > >
> > >
> > >
> > > On 03/02/2016 13:36, Amit Shah wrote:
> > > > ... and this can lead to breaking migration (the queue of requests on
> > > > the host needs to be migrated, else the new host will have no idea of
> > > > the queue).
> > >
> > > It is already migrated as part of virtio_rng_save's call to virtio_save.
> > > On the loading side, virtio_rng_process condenses all requests into one
> > > and chr_read fills in as many virtqueue buffers as possible from the
> > > single request.
> > >
> > > Cancel_requests seems useless.
> > >
> > > Paolo
> > >
> > >
>>> It is already migrated as part of virtio_rng_save's call to virtio_save.
>>> On the loading side, virtio_rng_process condenses all requests into one
>>> and chr_read fills in as many virtqueue buffers as possible from the
>>> single request.
>>
>> Thanks! So this looks broken. The contract between virtio-rng and the rng
>> backend is "one chr_read callback per one rng_backend_request_entropy",
>> regardless of the request size. It guarantees that chr_read will get at
>> least one byte, nothing else. If the one condensed request is bigger than
>> what the backend is able to give at the moment, there may be unsatisfied
>> virtio buffers left in the queue.
>>
>> One way of fixing this would be to have chr_read call virtio_rng_process
>> if it ran out of backend data but not out of virtio buffers. Otherwise
>> those buffers will be slowly drained by the rate limit timer, which is
>> not optimal (especially in the absence of rate limit :)

Nice observation! And the commit you mention below seems to be the right
way to fix it indeed.

Amit, do you recall why this was done?

> Basically, we could just revert this commit:
>
> commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e
> Author: Amit Shah <amit.shah@redhat.com>
> Date:   Wed Nov 21 11:21:19 2012 +0530
>
>     virtio-rng: remove extra request for entropy
>
>     If we got fewer bytes from the backend than requested, don't poke the
>     backend for more bytes; the guest will ask for more (or if the guest has
>     already asked for more, the backend knows about it via handle_input()

I've now reviewed your patch more carefully; I will make some comments in
reply to the patch directly.

Paolo
On (Thu) 04 Feb 2016 [13:07:35], Ladi Prosek wrote:
> ----- Original Message -----
> > ----- Original Message -----
> > >
> > >
> > > On 03/02/2016 13:36, Amit Shah wrote:
> > > > ... and this can lead to breaking migration (the queue of requests on
> > > > the host needs to be migrated, else the new host will have no idea of
> > > > the queue).
> > >
> > > It is already migrated as part of virtio_rng_save's call to virtio_save.
> > > On the loading side, virtio_rng_process condenses all requests into one
> > > and chr_read fills in as many virtqueue buffers as possible from the
> > > single request.
> >
> > Thanks! So this looks broken. The contract between virtio-rng and the rng
> > backend is "one chr_read callback per one rng_backend_request_entropy",
> > regardless of the request size. It guarantees that chr_read will get at
> > least one byte, nothing else. If the one condensed request is bigger than
> > what the backend is able to give at the moment, there may be unsatisfied
> > virtio buffers left in the queue.
> >
> > One way of fixing this would be to have chr_read call virtio_rng_process
> > if it ran out of backend data but not out of virtio buffers. Otherwise
> > those buffers will be slowly drained by the rate limit timer, which is
> > not optimal (especially in the absence of rate limit :)
>
> Basically, we could just revert this commit:
>
> commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e
> Author: Amit Shah <amit.shah@redhat.com>
> Date:   Wed Nov 21 11:21:19 2012 +0530
>
>     virtio-rng: remove extra request for entropy
>
>     If we got fewer bytes from the backend than requested, don't poke the
>     backend for more bytes; the guest will ask for more (or if the guest has
>     already asked for more, the backend knows about it via handle_input()

OK, agreed.

It makes sense to revert this so the live migration scenario is sane.
Without live migration, reverting this commit doesn't get us any
benefit, and that's why I had removed it.

Do you want to post a patch reverting this too?

Thanks,

Amit
On Thu, Mar 3, 2016 at 6:05 AM, Amit Shah <amit.shah@redhat.com> wrote:
> On (Thu) 04 Feb 2016 [13:07:35], Ladi Prosek wrote:
>> ----- Original Message -----
>> > ----- Original Message -----
>> > >
>> > >
>> > > On 03/02/2016 13:36, Amit Shah wrote:
>> > > > ... and this can lead to breaking migration (the queue of requests on
>> > > > the host needs to be migrated, else the new host will have no idea of
>> > > > the queue).
>> > >
>> > > It is already migrated as part of virtio_rng_save's call to virtio_save.
>> > > On the loading side, virtio_rng_process condenses all requests into one
>> > > and chr_read fills in as many virtqueue buffers as possible from the
>> > > single request.
>> >
>> > Thanks! So this looks broken. The contract between virtio-rng and the rng
>> > backend is "one chr_read callback per one rng_backend_request_entropy",
>> > regardless of the request size. It guarantees that chr_read will get at
>> > least one byte, nothing else. If the one condensed request is bigger than
>> > what the backend is able to give at the moment, there may be unsatisfied
>> > virtio buffers left in the queue.
>> >
>> > One way of fixing this would be to have chr_read call virtio_rng_process
>> > if it ran out of backend data but not out of virtio buffers. Otherwise
>> > those buffers will be slowly drained by the rate limit timer, which is
>> > not optimal (especially in the absence of rate limit :)
>>
>> Basically, we could just revert this commit:
>>
>> commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e
>> Author: Amit Shah <amit.shah@redhat.com>
>> Date:   Wed Nov 21 11:21:19 2012 +0530
>>
>>     virtio-rng: remove extra request for entropy
>>
>>     If we got fewer bytes from the backend than requested, don't poke the
>>     backend for more bytes; the guest will ask for more (or if the guest has
>>     already asked for more, the backend knows about it via handle_input()
>
> OK, agreed.
>
> It makes sense to revert this so the live migration scenario is sane.
> Without live migration, reverting this commit doesn't get us any
> benefit, and that's why I had removed it.
>
> Do you want to post a patch reverting this too?

Yes, I've had this ready for some time, just haven't posted it yet because
it depends on this series. Without this series, reverting the commit
completes a nice loop in the call graph, chr_read -> virtio_rng_process ->
rng_backend_request_entropy -> chr_read, leading to stack overflow. I've
actually hit it in testing.

> Thanks,
>
> Amit
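The recursion described above is easy to model outside QEMU. The toy program below uses made-up names and caps the depth so it terminates; in the real code the chain only stops when the backend runs out of data, the guest runs out of buffers, or the stack overflows.

#include <stdio.h>

#define MAX_DEPTH 16   /* cap for the demo; the real call chain has no cap */

static void chr_read_model(int depth);

/* A backend whose fd always has data ready: it completes the request
 * synchronously, i.e. it calls the device back before returning. */
static void request_entropy_model(int depth)
{
    chr_read_model(depth);
}

/* The device callback: it still has virtqueue buffers to fill, so it
 * immediately asks the backend for more. */
static void chr_read_model(int depth)
{
    printf("chr_read at depth %d\n", depth);
    if (depth >= MAX_DEPTH) {
        printf("(the real code would keep going until the stack overflows)\n");
        return;
    }
    request_entropy_model(depth + 1);
}

int main(void)
{
    request_entropy_model(0);
    return 0;
}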
diff --git a/backends/rng-random.c b/backends/rng-random.c
index 4e51f46..3f1fd21 100644
--- a/backends/rng-random.c
+++ b/backends/rng-random.c
@@ -22,10 +22,16 @@ struct RndRandom
 
     int fd;
     char *filename;
+    GSList *requests;
+};
+
+typedef struct RngRequest
+{
     EntropyReceiveFunc *receive_func;
     void *opaque;
     size_t size;
-};
+} RngRequest;
+
 
 /**
  * A simple and incomplete backend to request entropy from /dev/random.
@@ -37,19 +43,27 @@ struct RndRandom
 static void entropy_available(void *opaque)
 {
     RndRandom *s = RNG_RANDOM(opaque);
 
-    uint8_t buffer[s->size];
-    ssize_t len;
-    len = read(s->fd, buffer, s->size);
-    if (len < 0 && errno == EAGAIN) {
-        return;
-    }
-    g_assert(len != -1);
+    while (s->requests != NULL) {
+        RngRequest *req = s->requests->data;
+        uint8_t buffer[req->size];
+        ssize_t len;
+
+        len = read(s->fd, buffer, req->size);
+        if (len < 0 && errno == EAGAIN) {
+            return;
+        }
+        g_assert(len != -1);
 
-    s->receive_func(s->opaque, buffer, len);
-    s->receive_func = NULL;
+        req->receive_func(req->opaque, buffer, len);
 
-    qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
+        s->requests = g_slist_remove_link(s->requests, s->requests);
+        g_free(req);
+    }
+
+    if (s->requests == NULL) {
+        qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
+    }
 }
 
 static void rng_random_request_entropy(RngBackend *b, size_t size,
@@ -57,16 +71,36 @@ static void rng_random_request_entropy(RngBackend *b, size_t size,
                                         void *opaque)
 {
     RndRandom *s = RNG_RANDOM(b);
+    RngRequest *req;
 
-    if (s->receive_func) {
-        s->receive_func(s->opaque, NULL, 0);
+    req = g_malloc(sizeof(*req));
+
+    req->size = size;
+    req->receive_func = receive_entropy;
+    req->opaque = opaque;
+
+    s->requests = g_slist_append(s->requests, req);
+    if (s->requests->data == req) {
+        qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
+    }
+}
+
+static void rng_random_free_requests(RndRandom *s)
+{
+    GSList *i;
+
+    for (i = s->requests; i; i = i->next) {
+        g_free(i->data);
     }
 
-    s->receive_func = receive_entropy;
-    s->opaque = opaque;
-    s->size = size;
+    g_slist_free(s->requests);
+    s->requests = NULL;
+}
 
-    qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
+static void rng_random_cancel_requests(RngBackend *b)
+{
+    RndRandom *s = RNG_RANDOM(b);
+    rng_random_free_requests(s);
 }
 
 static void rng_random_opened(RngBackend *b, Error **errp)
@@ -129,6 +163,8 @@ static void rng_random_finalize(Object *obj)
     }
 
     g_free(s->filename);
+
+    rng_random_free_requests(s);
 }
 
 static void rng_random_class_init(ObjectClass *klass, void *data)
@@ -136,6 +172,7 @@ static void rng_random_class_init(ObjectClass *klass, void *data)
     RngBackendClass *rbc = RNG_BACKEND_CLASS(klass);
 
     rbc->request_entropy = rng_random_request_entropy;
+    rbc->cancel_requests = rng_random_cancel_requests;
     rbc->opened = rng_random_opened;
 }
If the guest adds a buffer to the virtio queue while another buffer
is still pending and hasn't been filled and returned by the rng
device, rng-random internally discards the pending request, which
leads to the second buffer getting stuck in the queue. For the guest
this manifests as delayed completion of reads from virtio-rng, i.e.
a read is completed only after another read is issued.

This patch adds an internal queue of requests, analogous to what
rng-egd uses, to make sure that requests and responses are balanced
and correctly ordered.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 backends/rng-random.c | 71 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 54 insertions(+), 17 deletions(-)
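As a standalone illustration (not QEMU code) of the GSList pattern the patch relies on: requests are appended at the tail and always popped from the head, which is what keeps completions in the same order as the guest's buffers. Build against glib-2.0; unlike the patch, the list node is freed explicitly here with g_slist_free_1().

#include <glib.h>
#include <stdio.h>

typedef struct {
    size_t size;
} RngRequestDemo;

int main(void)
{
    GSList *requests = NULL;

    /* Queue three requests, as three guest buffers might arrive. */
    for (size_t i = 1; i <= 3; i++) {
        RngRequestDemo *req = g_malloc(sizeof(*req));
        req->size = 16 * i;
        requests = g_slist_append(requests, req);
    }

    /* Service them strictly in FIFO order. */
    while (requests != NULL) {
        GSList *link = requests;
        RngRequestDemo *req = link->data;

        printf("servicing request for %zu bytes\n", req->size);
        requests = g_slist_remove_link(requests, link);
        g_slist_free_1(link);   /* free the list node itself */
        g_free(req);
    }

    return 0;
}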