[2/6] vfs: vfs: Define new syscalls preadv2,pwritev2
diff mbox

Message ID 20160311095357.GA29350@lst.de
State New
Headers show

Commit Message

Christoph Hellwig March 11, 2016, 9:53 a.m. UTC
On Thu, Mar 10, 2016 at 07:15:04PM +0100, Michael Kerrisk (man-pages) wrote:
> Hi Christoph,
> 
> On 03/03/2016 04:03 PM, Christoph Hellwig wrote:
> > From: Milosz Tanski <milosz@adfin.com>
> > 
> > New syscalls that take an flag argument.   No flags are added yet in this
> > patch.
> 
> Are there some man pages patches for these proposed system calls?

This is what I have:

---
From d33a02d56f447a6cb223b3964e1dd894f2921d5c Mon Sep 17 00:00:00 2001
From: Milosz Tanski <milosz@adfin.com>
Date: Fri, 11 Mar 2016 10:52:31 +0100
Subject: add preadv2/pwritev2 documentation

New syscalls that are a variation on the preadv/pwritev but support an extra
flag argument.

Signed-off-by: Milosz Tanski <milosz@adfin.com>
[hch: added RWF_HIPRI documentation]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 man2/readv.2 | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 54 insertions(+), 9 deletions(-)

Comments

Michael Kerrisk (man-pages) April 18, 2016, 1:51 p.m. UTC | #1
Hello Christoph,

On 03/11/2016 09:53 AM, Christoph Hellwig wrote:
> On Thu, Mar 10, 2016 at 07:15:04PM +0100, Michael Kerrisk (man-pages) wrote:
>> Hi Christoph,
>>
>> On 03/03/2016 04:03 PM, Christoph Hellwig wrote:
>>> From: Milosz Tanski <milosz@adfin.com>
>>>
>>> New syscalls that take an flag argument.   No flags are added yet in this
>>> patch.
>>
>> Are there some man pages patches for these proposed system calls?
> 
> This is what I have:

Thanks. I applied the patch, but I see one point where the doc
and code differ, and I suspect that the code needs to be fixed.
See below.

> ---
>>From d33a02d56f447a6cb223b3964e1dd894f2921d5c Mon Sep 17 00:00:00 2001
> From: Milosz Tanski <milosz@adfin.com>
> Date: Fri, 11 Mar 2016 10:52:31 +0100
> Subject: add preadv2/pwritev2 documentation
> 
> New syscalls that are a variation on the preadv/pwritev but support an extra
> flag argument.
> 
> Signed-off-by: Milosz Tanski <milosz@adfin.com>
> [hch: added RWF_HIPRI documentation]
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  man2/readv.2 | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 54 insertions(+), 9 deletions(-)
> 
> diff --git a/man2/readv.2 b/man2/readv.2
> index 93f2b6f..5cba5e2 100644
> --- a/man2/readv.2
> +++ b/man2/readv.2
> @@ -45,6 +45,12 @@ readv, writev, preadv, pwritev \- read or write data into multiple buffers
>  .sp
>  .BI "ssize_t pwritev(int " fd ", const struct iovec *" iov ", int " iovcnt ,
>  .BI "                off_t " offset );
> +.sp
> +.BI "ssize_t preadv2(int " fd ", const struct iovec *" iov ", int " iovcnt ,
> +.BI "                off_t " offset ", int " flags );
> +.sp
> +.BI "ssize_t pwritev2(int " fd ", const struct iovec *" iov ", int " iovcnt ,
> +.BI "                 off_t " offset ", int " flags );
>  .fi
>  .sp
>  .in -4n
> @@ -166,9 +172,9 @@ The
>  system call combines the functionality of
>  .BR writev ()
>  and
> -.BR pwrite (2).
> +.BR pwrite (2) "."
>  It performs the same task as
> -.BR writev (),
> +.BR writev () ","
>  but adds a fourth argument,
>  .IR offset ,
>  which specifies the file offset at which the output operation
> @@ -178,15 +184,43 @@ The file offset is not changed by these system calls.
>  The file referred to by
>  .I fd
>  must be capable of seeking.
> +.SS preadv2() and pwritev2()
> +
> +This pair of system calls has similar functionality to the
> +.BR preadv ()
> +and
> +.BR pwritev ()
> +calls, but adds a fifth argument, \fIflags\fP, which modifies the behavior on a per call basis.
> +
> +Like the
> +.BR preadv ()
> +and
> +.BR pwritev ()
> +calls, they accept an \fIoffset\fP argument. Unlike those calls, if the \fIoffset\fP argument is set to -1 then the current file offset is used and updated.
> +
> +The \fIflags\fP arguments to
> +.BR preadv2 ()
> +and
> +.BR pwritev2 ()
> +contains a bitwise OR of one or more of the following flags:
> +.TP
> +.BR RWF_HIPRI " (since Linux 4.6)"
> +High priority read/write.  Allows block based filesystems to use polling of the
> +device, which provides lower latency, but may use additional ressources.  (Currently
> +only usable on a file descriptor opened using the
> +.BR O_DIRECT " flag)."
> +
>  .SH RETURN VALUE
>  On success,
> -.BR readv ()
> -and
> +.BR readv () ","
>  .BR preadv ()
> -return the number of bytes read;
> -.BR writev ()
>  and
> +.BR preadv2 ()
> +return the number of bytes read;
> +.BR writev () ","
>  .BR pwritev ()
> +and
> +.BR pwritev2 ()
>  return the number of bytes written.
>  
>  Note that is not an error for a successful call to transfer fewer bytes
> @@ -202,9 +236,11 @@ The errors are as given for
>  and
>  .BR write (2).
>  Furthermore,
> -.BR preadv ()
> -and
> +.BR preadv () ","
> +.BR preadv2 () ","
>  .BR pwritev ()
> +and
> +.BR pwritev2 ()
>  can also fail for the same reasons as
>  .BR lseek (2).
>  Additionally, the following error is defined:
> @@ -218,12 +254,17 @@ value.
>  .TP
>  .B EINVAL
>  The vector count \fIiovcnt\fP is less than zero or greater than the
> -permitted maximum.
> +permitted maximum. Or, an unknown flag is specified in \fIflags\fP.

In the case described in the last sentence, the code currently appears
to be returning EOPNOTSUPP. EINVAL is more usual here, so I think the
code needs adjusting. Your thoughts?

Cheers,

Michael
hch@infradead.org April 25, 2016, 8:47 a.m. UTC | #2
On Mon, Apr 18, 2016 at 02:51:50PM +0100, Michael Kerrisk (man-pages) wrote:
> Thanks. I applied the patch, but I see one point where the doc
> and code differ, and I suspect that the code needs to be fixed.
> See below.

> >  .TP
> >  .B EINVAL
> >  The vector count \fIiovcnt\fP is less than zero or greater than the
> > -permitted maximum.
> > +permitted maximum. Or, an unknown flag is specified in \fIflags\fP.
> 
> In the case described in the last sentence, the code currently appears
> to be returning EOPNOTSUPP. EINVAL is more usual here, so I think the
> code needs adjusting. Your thoughts?

I'd rather update the man page - EOPNOTSUPP is a much more descriptive
error code for this case.  I'll send you a patch.
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Kerrisk (man-pages) April 25, 2016, 5:35 p.m. UTC | #3
Hi Christoph,

On 25 April 2016 at 10:47, Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Apr 18, 2016 at 02:51:50PM +0100, Michael Kerrisk (man-pages) wrote:
>> Thanks. I applied the patch, but I see one point where the doc
>> and code differ, and I suspect that the code needs to be fixed.
>> See below.
>
>> >  .TP
>> >  .B EINVAL
>> >  The vector count \fIiovcnt\fP is less than zero or greater than the
>> > -permitted maximum.
>> > +permitted maximum. Or, an unknown flag is specified in \fIflags\fP.
>>
>> In the case described in the last sentence, the code currently appears
>> to be returning EOPNOTSUPP. EINVAL is more usual here, so I think the
>> code needs adjusting. Your thoughts?
>
> I'd rather update the man page - EOPNOTSUPP is a much more descriptive
> error code for this case.  I'll send you a patch.

Unless I'm misunderstanding something here, you're proposing something
very inconsistent. The standard error for unknown flag bits is EINVAL.
This is so for dozens of systems calls (check the man pages; you might
find a rare exception, but that's the point, they are exceptions). It
seems to me here that it's really the implementation that needs
fixing, not the man page!

Cheers,

Michael
hch@infradead.org May 8, 2016, 9:29 a.m. UTC | #4
On Mon, Apr 25, 2016 at 07:35:36PM +0200, Michael Kerrisk (man-pages) wrote:
> > I'd rather update the man page - EOPNOTSUPP is a much more descriptive
> > error code for this case.  I'll send you a patch.
> 
> Unless I'm misunderstanding something here, you're proposing something
> very inconsistent. The standard error for unknown flag bits is EINVAL.
> This is so for dozens of systems calls (check the man pages; you might
> find a rare exception, but that's the point, they are exceptions). It
> seems to me here that it's really the implementation that needs
> fixing, not the man page!

For new filesystem calls we try to use EOPNOTSUPP as much as possible,
e.g. fallocate.
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/man2/readv.2 b/man2/readv.2
index 93f2b6f..5cba5e2 100644
--- a/man2/readv.2
+++ b/man2/readv.2
@@ -45,6 +45,12 @@  readv, writev, preadv, pwritev \- read or write data into multiple buffers
 .sp
 .BI "ssize_t pwritev(int " fd ", const struct iovec *" iov ", int " iovcnt ,
 .BI "                off_t " offset );
+.sp
+.BI "ssize_t preadv2(int " fd ", const struct iovec *" iov ", int " iovcnt ,
+.BI "                off_t " offset ", int " flags );
+.sp
+.BI "ssize_t pwritev2(int " fd ", const struct iovec *" iov ", int " iovcnt ,
+.BI "                 off_t " offset ", int " flags );
 .fi
 .sp
 .in -4n
@@ -166,9 +172,9 @@  The
 system call combines the functionality of
 .BR writev ()
 and
-.BR pwrite (2).
+.BR pwrite (2) "."
 It performs the same task as
-.BR writev (),
+.BR writev () ","
 but adds a fourth argument,
 .IR offset ,
 which specifies the file offset at which the output operation
@@ -178,15 +184,43 @@  The file offset is not changed by these system calls.
 The file referred to by
 .I fd
 must be capable of seeking.
+.SS preadv2() and pwritev2()
+
+This pair of system calls has similar functionality to the
+.BR preadv ()
+and
+.BR pwritev ()
+calls, but adds a fifth argument, \fIflags\fP, which modifies the behavior on a per call basis.
+
+Like the
+.BR preadv ()
+and
+.BR pwritev ()
+calls, they accept an \fIoffset\fP argument. Unlike those calls, if the \fIoffset\fP argument is set to -1 then the current file offset is used and updated.
+
+The \fIflags\fP arguments to
+.BR preadv2 ()
+and
+.BR pwritev2 ()
+contains a bitwise OR of one or more of the following flags:
+.TP
+.BR RWF_HIPRI " (since Linux 4.6)"
+High priority read/write.  Allows block based filesystems to use polling of the
+device, which provides lower latency, but may use additional ressources.  (Currently
+only usable on a file descriptor opened using the
+.BR O_DIRECT " flag)."
+
 .SH RETURN VALUE
 On success,
-.BR readv ()
-and
+.BR readv () ","
 .BR preadv ()
-return the number of bytes read;
-.BR writev ()
 and
+.BR preadv2 ()
+return the number of bytes read;
+.BR writev () ","
 .BR pwritev ()
+and
+.BR pwritev2 ()
 return the number of bytes written.
 
 Note that is not an error for a successful call to transfer fewer bytes
@@ -202,9 +236,11 @@  The errors are as given for
 and
 .BR write (2).
 Furthermore,
-.BR preadv ()
-and
+.BR preadv () ","
+.BR preadv2 () ","
 .BR pwritev ()
+and
+.BR pwritev2 ()
 can also fail for the same reasons as
 .BR lseek (2).
 Additionally, the following error is defined:
@@ -218,12 +254,17 @@  value.
 .TP
 .B EINVAL
 The vector count \fIiovcnt\fP is less than zero or greater than the
-permitted maximum.
+permitted maximum. Or, an unknown flag is specified in \fIflags\fP.
 .SH VERSIONS
 .BR preadv ()
 and
 .BR pwritev ()
 first appeared in Linux 2.6.30; library support was added in glibc 2.10.
+.sp
+.BR preadv2 ()
+and
+.BR pwritev2 ()
+first appeared in Linux 4.6
 .SH CONFORMING TO
 .BR readv (),
 .BR writev ():
@@ -237,6 +278,10 @@  POSIX.1-2001, POSIX.1-2008,
 .BR preadv (),
 .BR pwritev ():
 nonstandard, but present also on the modern BSDs.
+.sp
+.BR preadv2 (),
+.BR pwritev2 ():
+nonstandard, Linux extension.
 .SH NOTES
 POSIX.1 allows an implementation to place a limit on
 the number of items that can be passed in