diff mbox

rng-random: implement request queue

Message ID 1453465198-11000-1-git-send-email-lprosek@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ladi Prosek Jan. 22, 2016, 12:19 p.m. UTC
If the guest adds a buffer to the virtio queue while another buffer
is still pending and hasn't been filled and returned by the rng
device, rng-random internally discards the pending request, which
leads to the second buffer getting stuck in the queue. For the guest
this manifests as delayed completion of reads from virtio-rng, i.e.
a read is completed only after another read is issued.

This patch adds an internal queue of requests, analogous to what
rng-egd uses, to make sure that requests and responses are balanced
and correctly ordered.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 backends/rng-random.c | 71 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 54 insertions(+), 17 deletions(-)

Comments

Amit Shah Feb. 3, 2016, 12:36 p.m. UTC | #1
Hi Ladi,

Adding Pankaj to CC, he too looked at this recently.

On (Fri) 22 Jan 2016 [13:19:58], Ladi Prosek wrote:
> If the guest adds a buffer to the virtio queue while another buffer
> is still pending and hasn't been filled and returned by the rng
> device, rng-random internally discards the pending request, which
> leads to the second buffer getting stuck in the queue. For the guest
> this manifests as delayed completion of reads from virtio-rng, i.e.
> a read is completed only after another read is issued.
> 
> This patch adds an internal queue of requests, analogous to what
> rng-egd uses, to make sure that requests and responses are balanced
> and correctly ordered.

... and this can lead to breaking migration (the queue of requests on
the host needs to be migrated, else the new host will have no idea of
the queue).

I think we should limit the queue size to 1 instead.  Multiple rng
requests should not be common, because if we did have entropy, we'd
just service the guest request and be done with it.  If we haven't
replied to the guest, it just means that the host itself is waiting
for more entropy, or is waiting for the timeout before the guest's
ratelimit is lifted.

So, instead of fixing this using a queue, how about limiting the size
of the vq to have just one element at a time?

Thanks,

		Amit
Ladi Prosek Feb. 3, 2016, 6:02 p.m. UTC | #2
Hi Amit,

----- Original Message -----
> Hi Ladi,
> 
> Adding Pankaj to CC, he too looked at this recently.
> 
> On (Fri) 22 Jan 2016 [13:19:58], Ladi Prosek wrote:
> > If the guest adds a buffer to the virtio queue while another buffer
> > is still pending and hasn't been filled and returned by the rng
> > device, rng-random internally discards the pending request, which
> > leads to the second buffer getting stuck in the queue. For the guest
> > this manifests as delayed completion of reads from virtio-rng, i.e.
> > a read is completed only after another read is issued.
> > 
> > This patch adds an internal queue of requests, analogous to what
> > rng-egd uses, to make sure that requests and responses are balanced
> > and correctly ordered.
> 
> ... and this can lead to breaking migration (the queue of requests on
> the host needs to be migrated, else the new host will have no idea of
> the queue).

I was under the impression that clearing the queue pre-migration as
implemented by the RngBackendClass::cancel_requests callback is enough.
If it wasn't, the rgn-egd backend would be already broken as its
queueing logic is pretty much identical.

/**
 * rng_backend_cancel_requests:
 * @s: the backend to cancel all pending requests in
 *
 * Cancels all pending requests submitted by @rng_backend_request_entropy.  This
 * should be used by a device during reset or in preparation for live migration
 * to stop tracking any request.
 */
void rng_backend_cancel_requests(RngBackend *s);

Upon closer inspection though, this function appears to have no callers.
Either I'm missing something or there's another bug to be fixed.

> I think we should limit the queue size to 1 instead.  Multiple rng
> requests should not be common, because if we did have entropy, we'd
> just service the guest request and be done with it.  If we haven't
> replied to the guest, it just means that the host itself is waiting
> for more entropy, or is waiting for the timeout before the guest's
> ratelimit is lifted.

The scenario I had in mind is multiple processes in the guest
requesting entropy at the same time, no ratelimit, and fast entropy
source in the host. Being able to queue up requests would definitely
help boost performance, I think I even benchmarked it but I must
have lost the numbers. I can set it up again and rerun the benchmark
if you're interested.
 
> So, instead of fixing this using a queue, how about limiting the size
> of the vq to have just one element at a time?

I don't believe that this is a good solution. Although perfectly valid
spec-wise, I can see how a one-element queue could confuse less than
perfect driver implementations. Additionally, the driver would have to
implement some kind of a guest-side queueing logic and serialize its
requests or else be dropping them if the virtqueue is full. Overall,
I don't think that it's completely crazy to call it a breaking change.

> Thanks,
> 
> 		Amit
> 
>
Paolo Bonzini Feb. 3, 2016, 6:44 p.m. UTC | #3
On 03/02/2016 13:36, Amit Shah wrote:
> ... and this can lead to breaking migration (the queue of requests on
> the host needs to be migrated, else the new host will have no idea of
> the queue).

It is already migrated as part of virtio_rng_save's call to virtio_save.
 On the loading side, virtio_rng_process condenses all requests into one
and chr_read fills in as many virtqueue buffers as possible from the
single request.

Cancel_requests seems useless.

Paolo
Pankaj Gupta Feb. 4, 2016, 8:53 a.m. UTC | #4
Hi Ladi,

I think this is fine if we have multiple requests from Guests and 
depending on entropy pool available we can honour individual requests
and return with the entropy.

Just one point I have is, Suppose we have multiple requests from Guests  
and we are returning if request length < 0 for any request. There might 
be pending requests which will get executed in next iteration?Can we honour
them in same iteration?

static void entropy_available(void *opaque)
 {
...
...
 while (s->requests != NULL) {
...
if (len < 0 && errno == EAGAIN) {
+            return;
+        }
...

Best regards,
Pankaj
> 
> 
> 
> On 03/02/2016 13:36, Amit Shah wrote:
> > ... and this can lead to breaking migration (the queue of requests on
> > the host needs to be migrated, else the new host will have no idea of
> > the queue).
> 
> It is already migrated as part of virtio_rng_save's call to virtio_save.
>  On the loading side, virtio_rng_process condenses all requests into one
> and chr_read fills in as many virtqueue buffers as possible from the
> single request.
> 
> Cancel_requests seems useless.
> 
> Paolo
>
Ladi Prosek Feb. 4, 2016, 5:24 p.m. UTC | #5
----- Original Message -----
> 
> 
> On 03/02/2016 13:36, Amit Shah wrote:
> > ... and this can lead to breaking migration (the queue of requests on
> > the host needs to be migrated, else the new host will have no idea of
> > the queue).
> 
> It is already migrated as part of virtio_rng_save's call to virtio_save.
>  On the loading side, virtio_rng_process condenses all requests into one
> and chr_read fills in as many virtqueue buffers as possible from the
> single request.

Thanks! So this looks broken. The contract between virtio-rng and the rng
backend is "one chr_read callback per one rng_backend_request_entropy",
regardless of the request size. It guarantees that chr_read will get at
least one byte, nothing else. If the one condensed request is bigger than
what the backend is able to give at the moment, there may be unsatisfied
virtio buffers left in the queue.

One way of fixing this would be to have chr_read call virtio_rng_process
if it ran out of backend data but not out of virtio buffers. Otherwise
those buffers will be slowly drained by the rate limit timer, which is
not optimal (especially in the absence of rate limit :)
 
> Cancel_requests seems useless.

I'll delete that code.

> Paolo
> 
>
Ladi Prosek Feb. 4, 2016, 5:36 p.m. UTC | #6
Hi Pankaj,

----- Original Message -----
> 
> Hi Ladi,
> 
> I think this is fine if we have multiple requests from Guests and
> depending on entropy pool available we can honour individual requests
> and return with the entropy.
> 
> Just one point I have is, Suppose we have multiple requests from Guests
> and we are returning if request length < 0 for any request. There might
> be pending requests which will get executed in next iteration?Can we honour
> them in same iteration?
> 
> static void entropy_available(void *opaque)
>  {
> ...
> ...
>  while (s->requests != NULL) {
> ...
> if (len < 0 && errno == EAGAIN) {
> +            return;
> +        }
> ...

All requests in the queue are waiting for data to be readable from the same
fd. If the fd has no data (read fails with EAGAIN), there's no point in
hammering on the fd in a loop. Returning seems to be the right thing to do.

Please let me know if I misunderstood your comment.
 
> Best regards,
> Pankaj
> > 
> > 
> > 
> > On 03/02/2016 13:36, Amit Shah wrote:
> > > ... and this can lead to breaking migration (the queue of requests on
> > > the host needs to be migrated, else the new host will have no idea of
> > > the queue).
> > 
> > It is already migrated as part of virtio_rng_save's call to virtio_save.
> >  On the loading side, virtio_rng_process condenses all requests into one
> > and chr_read fills in as many virtqueue buffers as possible from the
> > single request.
> > 
> > Cancel_requests seems useless.
> > 
> > Paolo
> > 
> 
>
Ladi Prosek Feb. 4, 2016, 6:07 p.m. UTC | #7
----- Original Message -----
> ----- Original Message -----
> > 
> > 
> > On 03/02/2016 13:36, Amit Shah wrote:
> > > ... and this can lead to breaking migration (the queue of requests on
> > > the host needs to be migrated, else the new host will have no idea of
> > > the queue).
> > 
> > It is already migrated as part of virtio_rng_save's call to virtio_save.
> >  On the loading side, virtio_rng_process condenses all requests into one
> > and chr_read fills in as many virtqueue buffers as possible from the
> > single request.
> 
> Thanks! So this looks broken. The contract between virtio-rng and the rng
> backend is "one chr_read callback per one rng_backend_request_entropy",
> regardless of the request size. It guarantees that chr_read will get at
> least one byte, nothing else. If the one condensed request is bigger than
> what the backend is able to give at the moment, there may be unsatisfied
> virtio buffers left in the queue.
> 
> One way of fixing this would be to have chr_read call virtio_rng_process
> if it ran out of backend data but not out of virtio buffers. Otherwise
> those buffers will be slowly drained by the rate limit timer, which is
> not optimal (especially in the absence of rate limit :)

Basically, we could just revert this commit:

commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e
Author: Amit Shah <amit.shah@redhat.com>
Date:   Wed Nov 21 11:21:19 2012 +0530

    virtio-rng: remove extra request for entropy
    
    If we got fewer bytes from the backend than requested, don't poke the
    backend for more bytes; the guest will ask for more (or if the guest has
    already asked for more, the backend knows about it via handle_input()

> > Cancel_requests seems useless.
> 
> I'll delete that code.
> 
> > Paolo
> > 
> > 
> 
>
Pankaj Gupta Feb. 5, 2016, 5:31 a.m. UTC | #8
> 
> Hi Pankaj,
> 
> ----- Original Message -----
> > 
> > Hi Ladi,
> > 
> > I think this is fine if we have multiple requests from Guests and
> > depending on entropy pool available we can honour individual requests
> > and return with the entropy.
> > 
> > Just one point I have is, Suppose we have multiple requests from Guests
> > and we are returning if request length < 0 for any request. There might
> > be pending requests which will get executed in next iteration?Can we honour
> > them in same iteration?
> > 
> > static void entropy_available(void *opaque)
> >  {
> > ...
> > ...
> >  while (s->requests != NULL) {
> > ...
> > if (len < 0 && errno == EAGAIN) {
> > +            return;
> > +        }
> > ...
> 
> All requests in the queue are waiting for data to be readable from the same
> fd. If the fd has no data (read fails with EAGAIN), there's no point in
> hammering on the fd in a loop. Returning seems to be the right thing to do.
> 
> Please let me know if I misunderstood your comment.

You are right.

Thanks,
Pankaj
>  
> > Best regards,
> > Pankaj
> > > 
> > > 
> > > 
> > > On 03/02/2016 13:36, Amit Shah wrote:
> > > > ... and this can lead to breaking migration (the queue of requests on
> > > > the host needs to be migrated, else the new host will have no idea of
> > > > the queue).
> > > 
> > > It is already migrated as part of virtio_rng_save's call to virtio_save.
> > >  On the loading side, virtio_rng_process condenses all requests into one
> > > and chr_read fills in as many virtqueue buffers as possible from the
> > > single request.
> > > 
> > > Cancel_requests seems useless.
> > > 
> > > Paolo
> > > 
> > 
> > 
>
Paolo Bonzini Feb. 5, 2016, 8:32 a.m. UTC | #9
>>> It is already migrated as part of virtio_rng_save's call to virtio_save.
>>>  On the loading side, virtio_rng_process condenses all requests into one
>>> and chr_read fills in as many virtqueue buffers as possible from the
>>> single request.
>>
>> Thanks! So this looks broken. The contract between virtio-rng and the rng
>> backend is "one chr_read callback per one rng_backend_request_entropy",
>> regardless of the request size. It guarantees that chr_read will get at
>> least one byte, nothing else. If the one condensed request is bigger than
>> what the backend is able to give at the moment, there may be unsatisfied
>> virtio buffers left in the queue.
>>
>> One way of fixing this would be to have chr_read call virtio_rng_process
>> if it ran out of backend data but not out of virtio buffers. Otherwise
>> those buffers will be slowly drained by the rate limit timer, which is
>> not optimal (especially in the absence of rate limit :)

Nice observation!  And the commit you mention below seems to be the
right way to fix it indeed.  Amit, do you recall why this was done?

> Basically, we could just revert this commit:
> 
> commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e
> Author: Amit Shah <amit.shah@redhat.com>
> Date:   Wed Nov 21 11:21:19 2012 +0530
> 
>     virtio-rng: remove extra request for entropy
>     
>     If we got fewer bytes from the backend than requested, don't poke the
>     backend for more bytes; the guest will ask for more (or if the guest has
>     already asked for more, the backend knows about it via handle_input()

I've now reviewed your patch more carefully, I will make some comments
in reply to the patch directly.

Paolo
Amit Shah March 3, 2016, 5:05 a.m. UTC | #10
On (Thu) 04 Feb 2016 [13:07:35], Ladi Prosek wrote:
> ----- Original Message -----
> > ----- Original Message -----
> > > 
> > > 
> > > On 03/02/2016 13:36, Amit Shah wrote:
> > > > ... and this can lead to breaking migration (the queue of requests on
> > > > the host needs to be migrated, else the new host will have no idea of
> > > > the queue).
> > > 
> > > It is already migrated as part of virtio_rng_save's call to virtio_save.
> > >  On the loading side, virtio_rng_process condenses all requests into one
> > > and chr_read fills in as many virtqueue buffers as possible from the
> > > single request.
> > 
> > Thanks! So this looks broken. The contract between virtio-rng and the rng
> > backend is "one chr_read callback per one rng_backend_request_entropy",
> > regardless of the request size. It guarantees that chr_read will get at
> > least one byte, nothing else. If the one condensed request is bigger than
> > what the backend is able to give at the moment, there may be unsatisfied
> > virtio buffers left in the queue.
> > 
> > One way of fixing this would be to have chr_read call virtio_rng_process
> > if it ran out of backend data but not out of virtio buffers. Otherwise
> > those buffers will be slowly drained by the rate limit timer, which is
> > not optimal (especially in the absence of rate limit :)
> 
> Basically, we could just revert this commit:
> 
> commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e
> Author: Amit Shah <amit.shah@redhat.com>
> Date:   Wed Nov 21 11:21:19 2012 +0530
> 
>     virtio-rng: remove extra request for entropy
>     
>     If we got fewer bytes from the backend than requested, don't poke the
>     backend for more bytes; the guest will ask for more (or if the guest has
>     already asked for more, the backend knows about it via handle_input()

OK, agreed.

It makes sense to revert this so the live migration scenario is sane.
Without live migration reverting this commit doesn't get us any
benefit, and that's why I had removed it.

Do you want to post a patch reverting this too?

Thanks,

		Amit
Ladi Prosek March 3, 2016, 9:30 a.m. UTC | #11
On Thu, Mar 3, 2016 at 6:05 AM, Amit Shah <amit.shah@redhat.com> wrote:
> On (Thu) 04 Feb 2016 [13:07:35], Ladi Prosek wrote:
>> ----- Original Message -----
>> > ----- Original Message -----
>> > >
>> > >
>> > > On 03/02/2016 13:36, Amit Shah wrote:
>> > > > ... and this can lead to breaking migration (the queue of requests on
>> > > > the host needs to be migrated, else the new host will have no idea of
>> > > > the queue).
>> > >
>> > > It is already migrated as part of virtio_rng_save's call to virtio_save.
>> > >  On the loading side, virtio_rng_process condenses all requests into one
>> > > and chr_read fills in as many virtqueue buffers as possible from the
>> > > single request.
>> >
>> > Thanks! So this looks broken. The contract between virtio-rng and the rng
>> > backend is "one chr_read callback per one rng_backend_request_entropy",
>> > regardless of the request size. It guarantees that chr_read will get at
>> > least one byte, nothing else. If the one condensed request is bigger than
>> > what the backend is able to give at the moment, there may be unsatisfied
>> > virtio buffers left in the queue.
>> >
>> > One way of fixing this would be to have chr_read call virtio_rng_process
>> > if it ran out of backend data but not out of virtio buffers. Otherwise
>> > those buffers will be slowly drained by the rate limit timer, which is
>> > not optimal (especially in the absence of rate limit :)
>>
>> Basically, we could just revert this commit:
>>
>> commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e
>> Author: Amit Shah <amit.shah@redhat.com>
>> Date:   Wed Nov 21 11:21:19 2012 +0530
>>
>>     virtio-rng: remove extra request for entropy
>>
>>     If we got fewer bytes from the backend than requested, don't poke the
>>     backend for more bytes; the guest will ask for more (or if the guest has
>>     already asked for more, the backend knows about it via handle_input()
>
> OK, agreed.
>
> It makes sense to revert this so the live migration scenario is sane.
> Without live migration reverting this commit doesn't get us any
> benefit, and that's why I had removed it.
>
> Do you want to post a patch reverting this too?

Yes, I've had this ready for some time, just haven't posted it yet
because it depends on this series. Without this series reverting the
commit completes a nice loop in the callgraph chr_read ->
virtio_rng_process -> rng_backend_request_entropy -> chr_read, leading
to stack overflow. I've actually hit it in testing.

> Thanks,
>
>                 Amit
diff mbox

Patch

diff --git a/backends/rng-random.c b/backends/rng-random.c
index 4e51f46..3f1fd21 100644
--- a/backends/rng-random.c
+++ b/backends/rng-random.c
@@ -22,10 +22,16 @@  struct RndRandom
     int fd;
     char *filename;
 
+    GSList *requests;
+};
+
+typedef struct RngRequest
+{
     EntropyReceiveFunc *receive_func;
     void *opaque;
     size_t size;
-};
+} RngRequest;
+
 
 /**
  * A simple and incomplete backend to request entropy from /dev/random.
@@ -37,19 +43,27 @@  struct RndRandom
 static void entropy_available(void *opaque)
 {
     RndRandom *s = RNG_RANDOM(opaque);
-    uint8_t buffer[s->size];
-    ssize_t len;
 
-    len = read(s->fd, buffer, s->size);
-    if (len < 0 && errno == EAGAIN) {
-        return;
-    }
-    g_assert(len != -1);
+    while (s->requests != NULL) {
+        RngRequest *req = s->requests->data;
+        uint8_t buffer[req->size];
+        ssize_t len;
+
+        len = read(s->fd, buffer, req->size);
+        if (len < 0 && errno == EAGAIN) {
+            return;
+        }
+        g_assert(len != -1);
 
-    s->receive_func(s->opaque, buffer, len);
-    s->receive_func = NULL;
+        req->receive_func(req->opaque, buffer, len);
 
-    qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
+        s->requests = g_slist_remove_link(s->requests, s->requests);
+        g_free(req);
+    }
+
+    if (s->requests == NULL) {
+        qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
+    }
 }
 
 static void rng_random_request_entropy(RngBackend *b, size_t size,
@@ -57,16 +71,36 @@  static void rng_random_request_entropy(RngBackend *b, size_t size,
                                         void *opaque)
 {
     RndRandom *s = RNG_RANDOM(b);
+    RngRequest *req;
 
-    if (s->receive_func) {
-        s->receive_func(s->opaque, NULL, 0);
+    req = g_malloc(sizeof(*req));
+
+    req->size = size;
+    req->receive_func = receive_entropy;
+    req->opaque = opaque;
+
+    s->requests = g_slist_append(s->requests, req);
+    if (s->requests->data == req) {
+        qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
+    }
+}
+
+static void rng_random_free_requests(RndRandom *s)
+{
+    GSList *i;
+
+    for (i = s->requests; i; i = i->next) {
+        g_free(i->data);
     }
 
-    s->receive_func = receive_entropy;
-    s->opaque = opaque;
-    s->size = size;
+    g_slist_free(s->requests);
+    s->requests = NULL;
+}
 
-    qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
+static void rng_random_cancel_requests(RngBackend *b)
+{
+    RndRandom *s = RNG_RANDOM(b);
+    rng_random_free_requests(s);
 }
 
 static void rng_random_opened(RngBackend *b, Error **errp)
@@ -129,6 +163,8 @@  static void rng_random_finalize(Object *obj)
     }
 
     g_free(s->filename);
+
+    rng_random_free_requests(s);
 }
 
 static void rng_random_class_init(ObjectClass *klass, void *data)
@@ -136,6 +172,7 @@  static void rng_random_class_init(ObjectClass *klass, void *data)
     RngBackendClass *rbc = RNG_BACKEND_CLASS(klass);
 
     rbc->request_entropy = rng_random_request_entropy;
+    rbc->cancel_requests = rng_random_cancel_requests;
     rbc->opened = rng_random_opened;
 }