[2/4] block/curl: Fix return value from curl_read_cb
diff mbox

Message ID 20161025025431.24714-3-mreitz@redhat.com
State New
Headers show

Commit Message

Max Reitz Oct. 25, 2016, 2:54 a.m. UTC
While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
the callback is supposed to return the number of bytes handled; what it
does not mention is that libcurl will throw an error if the callback did
not "handle" all of the data passed to it.

Therefore, if the callback receives some data that it cannot handle
(either because the receive buffer has not been set up yet or because it
would not fit into the receive buffer) and we have to ignore it, we
still have to report that the data has been handled.

Obviously, this should not happen normally. But it does happen at least
for FTP connections where some data (that we do not expect) may be
generated when the connection is established.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/curl.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

Comments

Eric Blake Oct. 25, 2016, 6:37 p.m. UTC | #1
On 10/24/2016 09:54 PM, Max Reitz wrote:
> While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
> the callback is supposed to return the number of bytes handled; what it
> does not mention is that libcurl will throw an error if the callback did
> not "handle" all of the data passed to it.
> 
> Therefore, if the callback receives some data that it cannot handle
> (either because the receive buffer has not been set up yet or because it
> would not fit into the receive buffer) and we have to ignore it, we
> still have to report that the data has been handled.
> 
> Obviously, this should not happen normally. But it does happen at least
> for FTP connections where some data (that we do not expect) may be
> generated when the connection is established.

Just to make sure, we aren't losing data by reporting this value, but
merely letting curl know that our callback has "dealt" with the data, so
that we don't error out, in order to get a second chance at the same
data later on?

Reviewed-by: Eric Blake <eblake@redhat.com>

But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
Matthew and/or tested by Richard.
Kevin Wolf Oct. 26, 2016, 9:17 a.m. UTC | #2
Am 25.10.2016 um 20:37 hat Eric Blake geschrieben:
> On 10/24/2016 09:54 PM, Max Reitz wrote:
> > While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
> > the callback is supposed to return the number of bytes handled; what it
> > does not mention is that libcurl will throw an error if the callback did
> > not "handle" all of the data passed to it.
> > 
> > Therefore, if the callback receives some data that it cannot handle
> > (either because the receive buffer has not been set up yet or because it
> > would not fit into the receive buffer) and we have to ignore it, we
> > still have to report that the data has been handled.
> > 
> > Obviously, this should not happen normally. But it does happen at least
> > for FTP connections where some data (that we do not expect) may be
> > generated when the connection is established.
> 
> Just to make sure, we aren't losing data by reporting this value, but
> merely letting curl know that our callback has "dealt" with the data, so
> that we don't error out, in order to get a second chance at the same
> data later on?
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>
> 
> But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
> Matthew and/or tested by Richard.

In that case, I guess we should CC them. (Hereby done.)

Kevin
Richard W.M. Jones Oct. 26, 2016, 9:38 a.m. UTC | #3
On Tue, Oct 25, 2016 at 04:54:29AM +0200, Max Reitz wrote:
> While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
> the callback is supposed to return the number of bytes handled; what it
> does not mention is that libcurl will throw an error if the callback did
> not "handle" all of the data passed to it.
> 
> Therefore, if the callback receives some data that it cannot handle
> (either because the receive buffer has not been set up yet or because it
> would not fit into the receive buffer) and we have to ignore it, we
> still have to report that the data has been handled.
> 
> Obviously, this should not happen normally. But it does happen at least
> for FTP connections where some data (that we do not expect) may be
> generated when the connection is established.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/curl.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index 12afa15..095ffda 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -212,12 +212,13 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
>  
>      DPRINTF("CURL: Just reading %zd bytes\n", realsize);
>  
> -    if (!s || !s->orig_buf)
> -        return 0;
> +    if (!s || !s->orig_buf) {
> +        goto read_end;
> +    }
>  
>      if (s->buf_off >= s->buf_len) {
>          /* buffer full, read nothing */
> -        return 0;
> +        goto read_end;
>      }
>      realsize = MIN(realsize, s->buf_len - s->buf_off);
>      memcpy(s->orig_buf + s->buf_off, ptr, realsize);
> @@ -238,7 +239,9 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
>          }
>      }
>  
> -    return realsize;
> +read_end:
> +    /* curl will error out if we do not return this value */
> +    return size * nmemb;
>  }

Luckily I documented my test case last time:
https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg04883.html

I went back and tried a similar test case with just this patch on top
of current qemu from git, and it doesn't appear to break anything for
the http case.

So it's fine as far as I can see.

Rich.

$ http_proxy= LIBGUESTFS_BACKEND=direct LIBGUESTFS_HV=~/d/qemu/x86_64-softmmu/qemu-system-x86_64 guestfish -a http://onuma.home.annexia.org/media/installers/Fedora-23-Cloud-x86_64/Fedora-Cloud-Base-23-20151030.x86_64.qcow2 --ro -i

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: 'help' for help on commands
      'man' to read the manual
      'quit' to quit the shell

Operating system: Fedora 23 (Cloud Edition)
/dev/sda1 mounted on /

><fs> ll /
total 84
dr-xr-xr-x. 18 root root  4096 Oct 30  2015 .
drwxr-xr-x  19 root root  4096 Oct 26 09:36 ..
lrwxrwxrwx.  1 root root     7 Sep 10  2015 bin -> usr/bin
dr-xr-xr-x.  5 root root  4096 Oct 30  2015 boot
drwxr-xr-x.  2 root root  4096 Oct 30  2015 dev
drwxr-xr-x. 68 root root  4096 Oct 30  2015 etc
drwxr-xr-x.  2 root root  4096 Oct 30  2015 home
lrwxrwxrwx.  1 root root     7 Sep 10  2015 lib -> usr/lib
lrwxrwxrwx.  1 root root     9 Sep 10  2015 lib64 -> usr/lib64
drwx------.  2 root root 16384 Oct 30  2015 lost+found
drwxr-xr-x.  2 root root  4096 Sep 10  2015 media
drwxr-xr-x.  2 root root  4096 Sep 10  2015 mnt
drwxr-xr-x.  2 root root  4096 Sep 10  2015 opt
drwxr-xr-x.  2 root root  4096 Oct 30  2015 proc
dr-xr-x---.  2 root root  4096 Oct 30  2015 root
drwxr-xr-x.  2 root root  4096 Oct 30  2015 run
lrwxrwxrwx.  1 root root     8 Sep 10  2015 sbin -> usr/sbin
drwxr-xr-x.  2 root root  4096 Sep 10  2015 srv
drwxr-xr-x.  2 root root  4096 Oct 30  2015 sys
drwxrwxrwt.  7 root root  4096 Oct 30  2015 tmp
drwxr-xr-x. 12 root root  4096 Oct 30  2015 usr
drwxr-xr-x. 18 root root  4096 Oct 30  2015 var

><fs> find / | wc -l
26532
><fs>
Max Reitz Oct. 26, 2016, 2:43 p.m. UTC | #4
On 25.10.2016 20:37, Eric Blake wrote:
> On 10/24/2016 09:54 PM, Max Reitz wrote:
>> While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
>> the callback is supposed to return the number of bytes handled; what it
>> does not mention is that libcurl will throw an error if the callback did
>> not "handle" all of the data passed to it.
>>
>> Therefore, if the callback receives some data that it cannot handle
>> (either because the receive buffer has not been set up yet or because it
>> would not fit into the receive buffer) and we have to ignore it, we
>> still have to report that the data has been handled.
>>
>> Obviously, this should not happen normally. But it does happen at least
>> for FTP connections where some data (that we do not expect) may be
>> generated when the connection is established.
> 
> Just to make sure, we aren't losing data by reporting this value, but
> merely letting curl know that our callback has "dealt" with the data, so
> that we don't error out, in order to get a second chance at the same
> data later on?

I don't know about previous or future versions of libcurl, but I looked
at the current git code and it really just does two things with the
return code:

(1) Check whether it's a special code for pausing the connection, if so,
do what's necessary.

(2) If it is not, check whether it's different from the number of bytes
passed: If it is, report an error and abort the original transfer (which
in the case of establishing the connection means basically dropping the
connection).

Thus, curl will never save the data we didn't handle for later or
something like that. It's just an error if we don't report to have
handled everything.


This is also supported by the man page CURLOPT_WRITEFUNCTION(3):

> Your callback should return the number of bytes actually taken care
> of. If that amount differs from the amount passed to your callback
> function, it'll signal an error condition to the library. This will
> cause the transfer to get aborted and the libcurl function used will
> return CURLE_WRITE_ERROR.

Therefore, I don't think this is going to be changed in the future.

Max

> Reviewed-by: Eric Blake <eblake@redhat.com>
> 
> But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
> Matthew and/or tested by Richard.
>
Matthew Booth Nov. 1, 2016, 9:58 a.m. UTC | #5
I've long since lost all context on this patch, so I'll trust you :)

Thanks for the ping,

Matt

On Wed, Oct 26, 2016 at 10:17 AM, Kevin Wolf <kwolf@redhat.com> wrote:

> Am 25.10.2016 um 20:37 hat Eric Blake geschrieben:
> > On 10/24/2016 09:54 PM, Max Reitz wrote:
> > > While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in
> that
> > > the callback is supposed to return the number of bytes handled; what it
> > > does not mention is that libcurl will throw an error if the callback
> did
> > > not "handle" all of the data passed to it.
> > >
> > > Therefore, if the callback receives some data that it cannot handle
> > > (either because the receive buffer has not been set up yet or because
> it
> > > would not fit into the receive buffer) and we have to ignore it, we
> > > still have to report that the data has been handled.
> > >
> > > Obviously, this should not happen normally. But it does happen at least
> > > for FTP connections where some data (that we do not expect) may be
> > > generated when the connection is established.
> >
> > Just to make sure, we aren't losing data by reporting this value, but
> > merely letting curl know that our callback has "dealt" with the data, so
> > that we don't error out, in order to get a second chance at the same
> > data later on?
> >
> > Reviewed-by: Eric Blake <eblake@redhat.com>
> >
> > But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
> > Matthew and/or tested by Richard.
>
> In that case, I guess we should CC them. (Hereby done.)
>
> Kevin
>

Patch
diff mbox

diff --git a/block/curl.c b/block/curl.c
index 12afa15..095ffda 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -212,12 +212,13 @@  static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 
     DPRINTF("CURL: Just reading %zd bytes\n", realsize);
 
-    if (!s || !s->orig_buf)
-        return 0;
+    if (!s || !s->orig_buf) {
+        goto read_end;
+    }
 
     if (s->buf_off >= s->buf_len) {
         /* buffer full, read nothing */
-        return 0;
+        goto read_end;
     }
     realsize = MIN(realsize, s->buf_len - s->buf_off);
     memcpy(s->orig_buf + s->buf_off, ptr, realsize);
@@ -238,7 +239,9 @@  static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
         }
     }
 
-    return realsize;
+read_end:
+    /* curl will error out if we do not return this value */
+    return size * nmemb;
 }
 
 static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,