| Message ID | 20161025025431.24714-3-mreitz@redhat.com (mailing list archive) |
|---|---|
| State | New, archived |
On 10/24/2016 09:54 PM, Max Reitz wrote:
> While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
> the callback is supposed to return the number of bytes handled; what it
> does not mention is that libcurl will throw an error if the callback did
> not "handle" all of the data passed to it.
>
> Therefore, if the callback receives some data that it cannot handle
> (either because the receive buffer has not been set up yet or because it
> would not fit into the receive buffer) and we have to ignore it, we
> still have to report that the data has been handled.
>
> Obviously, this should not happen normally. But it does happen at least
> for FTP connections where some data (that we do not expect) may be
> generated when the connection is established.

Just to make sure, we aren't losing data by reporting this value, but
merely letting curl know that our callback has "dealt" with the data, so
that we don't error out, in order to get a second chance at the same
data later on?

Reviewed-by: Eric Blake <eblake@redhat.com>

But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
Matthew and/or tested by Richard.
Am 25.10.2016 um 20:37 hat Eric Blake geschrieben:
> On 10/24/2016 09:54 PM, Max Reitz wrote:
> > While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
> > the callback is supposed to return the number of bytes handled; what it
> > does not mention is that libcurl will throw an error if the callback did
> > not "handle" all of the data passed to it.
> >
> > Therefore, if the callback receives some data that it cannot handle
> > (either because the receive buffer has not been set up yet or because it
> > would not fit into the receive buffer) and we have to ignore it, we
> > still have to report that the data has been handled.
> >
> > Obviously, this should not happen normally. But it does happen at least
> > for FTP connections where some data (that we do not expect) may be
> > generated when the connection is established.
>
> Just to make sure, we aren't losing data by reporting this value, but
> merely letting curl know that our callback has "dealt" with the data, so
> that we don't error out, in order to get a second chance at the same
> data later on?
>
> Reviewed-by: Eric Blake <eblake@redhat.com>
>
> But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
> Matthew and/or tested by Richard.

In that case, I guess we should CC them. (Hereby done.)

Kevin
On Tue, Oct 25, 2016 at 04:54:29AM +0200, Max Reitz wrote:
> While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
> the callback is supposed to return the number of bytes handled; what it
> does not mention is that libcurl will throw an error if the callback did
> not "handle" all of the data passed to it.
>
> Therefore, if the callback receives some data that it cannot handle
> (either because the receive buffer has not been set up yet or because it
> would not fit into the receive buffer) and we have to ignore it, we
> still have to report that the data has been handled.
>
> Obviously, this should not happen normally. But it does happen at least
> for FTP connections where some data (that we do not expect) may be
> generated when the connection is established.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/curl.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/block/curl.c b/block/curl.c
> index 12afa15..095ffda 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -212,12 +212,13 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
>
>      DPRINTF("CURL: Just reading %zd bytes\n", realsize);
>
> -    if (!s || !s->orig_buf)
> -        return 0;
> +    if (!s || !s->orig_buf) {
> +        goto read_end;
> +    }
>
>      if (s->buf_off >= s->buf_len) {
>          /* buffer full, read nothing */
> -        return 0;
> +        goto read_end;
>      }
>      realsize = MIN(realsize, s->buf_len - s->buf_off);
>      memcpy(s->orig_buf + s->buf_off, ptr, realsize);
> @@ -238,7 +239,9 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
>          }
>      }
>
> -    return realsize;
> +read_end:
> +    /* curl will error out if we do not return this value */
> +    return size * nmemb;
>  }

Luckily I documented my test case last time:
https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg04883.html

I went back and tried a similar test case with just this patch on top of
current qemu from git, and it doesn't appear to break anything for the
http case. So it's fine as far as I can see.

Rich.

$ http_proxy= LIBGUESTFS_BACKEND=direct LIBGUESTFS_HV=~/d/qemu/x86_64-softmmu/qemu-system-x86_64 guestfish -a http://onuma.home.annexia.org/media/installers/Fedora-23-Cloud-x86_64/Fedora-Cloud-Base-23-20151030.x86_64.qcow2 --ro -i

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: 'help' for help on commands
      'man' to read the manual
      'quit' to quit the shell

Operating system: Fedora 23 (Cloud Edition)
/dev/sda1 mounted on /

><fs> ll /
total 84
dr-xr-xr-x. 18 root root  4096 Oct 30  2015 .
drwxr-xr-x  19 root root  4096 Oct 26 09:36 ..
lrwxrwxrwx.  1 root root     7 Sep 10  2015 bin -> usr/bin
dr-xr-xr-x.  5 root root  4096 Oct 30  2015 boot
drwxr-xr-x.  2 root root  4096 Oct 30  2015 dev
drwxr-xr-x. 68 root root  4096 Oct 30  2015 etc
drwxr-xr-x.  2 root root  4096 Oct 30  2015 home
lrwxrwxrwx.  1 root root     7 Sep 10  2015 lib -> usr/lib
lrwxrwxrwx.  1 root root     9 Sep 10  2015 lib64 -> usr/lib64
drwx------.  2 root root 16384 Oct 30  2015 lost+found
drwxr-xr-x.  2 root root  4096 Sep 10  2015 media
drwxr-xr-x.  2 root root  4096 Sep 10  2015 mnt
drwxr-xr-x.  2 root root  4096 Sep 10  2015 opt
drwxr-xr-x.  2 root root  4096 Oct 30  2015 proc
dr-xr-x---.  2 root root  4096 Oct 30  2015 root
drwxr-xr-x.  2 root root  4096 Oct 30  2015 run
lrwxrwxrwx.  1 root root     8 Sep 10  2015 sbin -> usr/sbin
drwxr-xr-x.  2 root root  4096 Sep 10  2015 srv
drwxr-xr-x.  2 root root  4096 Oct 30  2015 sys
drwxrwxrwt.  7 root root  4096 Oct 30  2015 tmp
drwxr-xr-x. 12 root root  4096 Oct 30  2015 usr
drwxr-xr-x. 18 root root  4096 Oct 30  2015 var
><fs> find / | wc -l
26532
><fs>
On 25.10.2016 20:37, Eric Blake wrote:
> On 10/24/2016 09:54 PM, Max Reitz wrote:
>> While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
>> the callback is supposed to return the number of bytes handled; what it
>> does not mention is that libcurl will throw an error if the callback did
>> not "handle" all of the data passed to it.
>>
>> Therefore, if the callback receives some data that it cannot handle
>> (either because the receive buffer has not been set up yet or because it
>> would not fit into the receive buffer) and we have to ignore it, we
>> still have to report that the data has been handled.
>>
>> Obviously, this should not happen normally. But it does happen at least
>> for FTP connections where some data (that we do not expect) may be
>> generated when the connection is established.
>
> Just to make sure, we aren't losing data by reporting this value, but
> merely letting curl know that our callback has "dealt" with the data, so
> that we don't error out, in order to get a second chance at the same
> data later on?

I don't know about previous or future versions of libcurl, but I looked
at the current git code and it really just does two things with the
return code:

(1) Check whether it's a special code for pausing the connection; if so,
    do what's necessary.

(2) If it is not, check whether it's different from the number of bytes
    passed: if it is, report an error and abort the original transfer
    (which in the case of establishing the connection means basically
    dropping the connection).

Thus, curl will never save the data we didn't handle for later or
something like that. It's just an error if we don't report to have
handled everything.

This is also supported by the man page CURLOPT_WRITEFUNCTION(3):

> Your callback should return the number of bytes actually taken care
> of. If that amount differs from the amount passed to your callback
> function, it'll signal an error condition to the library. This will
> cause the transfer to get aborted and the libcurl function used will
> return CURLE_WRITE_ERROR.

Therefore, I don't think this is going to be changed in the future.

Max

> Reviewed-by: Eric Blake <eblake@redhat.com>
>
> But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
> Matthew and/or tested by Richard.
I've long since lost all context on this patch, so I'll trust you :)

Thanks for the ping,
Matt

On Wed, Oct 26, 2016 at 10:17 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 25.10.2016 um 20:37 hat Eric Blake geschrieben:
> > On 10/24/2016 09:54 PM, Max Reitz wrote:
> > > While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
> > > the callback is supposed to return the number of bytes handled; what it
> > > does not mention is that libcurl will throw an error if the callback did
> > > not "handle" all of the data passed to it.
> > >
> > > Therefore, if the callback receives some data that it cannot handle
> > > (either because the receive buffer has not been set up yet or because it
> > > would not fit into the receive buffer) and we have to ignore it, we
> > > still have to report that the data has been handled.
> > >
> > > Obviously, this should not happen normally. But it does happen at least
> > > for FTP connections where some data (that we do not expect) may be
> > > generated when the connection is established.
> >
> > Just to make sure, we aren't losing data by reporting this value, but
> > merely letting curl know that our callback has "dealt" with the data, so
> > that we don't error out, in order to get a second chance at the same
> > data later on?
> >
> > Reviewed-by: Eric Blake <eblake@redhat.com>
> >
> > But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
> > Matthew and/or tested by Richard.
>
> In that case, I guess we should CC them. (Hereby done.)
>
> Kevin
diff --git a/block/curl.c b/block/curl.c
index 12afa15..095ffda 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -212,12 +212,13 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 
     DPRINTF("CURL: Just reading %zd bytes\n", realsize);
 
-    if (!s || !s->orig_buf)
-        return 0;
+    if (!s || !s->orig_buf) {
+        goto read_end;
+    }
 
     if (s->buf_off >= s->buf_len) {
         /* buffer full, read nothing */
-        return 0;
+        goto read_end;
     }
     realsize = MIN(realsize, s->buf_len - s->buf_off);
     memcpy(s->orig_buf + s->buf_off, ptr, realsize);
@@ -238,7 +239,9 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
         }
     }
 
-    return realsize;
+read_end:
+    /* curl will error out if we do not return this value */
+    return size * nmemb;
 }
 
 static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
the callback is supposed to return the number of bytes handled; what it
does not mention is that libcurl will throw an error if the callback did
not "handle" all of the data passed to it.

Therefore, if the callback receives some data that it cannot handle
(either because the receive buffer has not been set up yet or because it
would not fit into the receive buffer) and we have to ignore it, we
still have to report that the data has been handled.

Obviously, this should not happen normally. But it does happen at least
for FTP connections where some data (that we do not expect) may be
generated when the connection is established.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/curl.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)