[2/2] ubifs: Allow O_DIRECT

Message ID x49wpwjbh7p.fsf@segfault.boston.devel.redhat.com (mailing list archive)
State New, archived

Commit Message

Jeff Moyer Aug. 25, 2015, 2 p.m. UTC
Dave Chinner <david@fromorbit.com> writes:

> On Mon, Aug 24, 2015 at 01:19:24PM -0400, Jeff Moyer wrote:
>> Brian Norris <computersforpeace@gmail.com> writes:
>> 
>> > On Mon, Aug 24, 2015 at 10:13:25AM +0300, Artem Bityutskiy wrote:
>> >> Now, some user-space fails when direct I/O is not supported.
>> >
>> > I think the whole argument rested on what it means when "some user space
>> > fails"; apparently that "user space" is just a test suite (which
>> > can/should be fixed).
>> 
>> Even if it wasn't a test suite it should still fail.  Either the fs
>> supports O_DIRECT or it doesn't.  Right now, the only way an application
>> can figure this out is to try an open and see if it fails.  Don't break
>> that.
>
> Who cares how a filesystem implements O_DIRECT as long as it does
> not corrupt data? ext3 fell back to buffered IO in many situations,
> yet the only complaints about that were performance. IOWs, it's long been
> true that if the user cares about O_DIRECT *performance* then they
> have to be careful about their choice of filesystem.

> But if it's only 5 lines of code per filesystem to support O_DIRECT
> *correctly* via buffered IO, then exactly why should userspace have
> to jump through hoops to explicitly handle open(O_DIRECT) failure?

> Especially when you consider that all they can do is fall back to
> buffered IO themselves....

I had written counterpoints for all of this, but I thought better of
it.  Old versions of the kernel simply ignore O_DIRECT, so clearly
there's precedent.

I do think we should at least document what file systems appear to be
doing.  Here's a man page patch for open (generated with extra context
for easier reading).  Let me know what you think.

Cheers,
Jeff

p.s. I still think it's the wrong way to go, as it makes it harder for
     an admin to determine what is actually going on.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Chris Mason Aug. 25, 2015, 2:13 p.m. UTC | #1
On Tue, Aug 25, 2015 at 10:00:58AM -0400, Jeff Moyer wrote:
> Dave Chinner <david@fromorbit.com> writes:
> 
> > On Mon, Aug 24, 2015 at 01:19:24PM -0400, Jeff Moyer wrote:
> >> Brian Norris <computersforpeace@gmail.com> writes:
> >> 
> >> > On Mon, Aug 24, 2015 at 10:13:25AM +0300, Artem Bityutskiy wrote:
> >> >> Now, some user-space fails when direct I/O is not supported.
> >> >
> >> > I think the whole argument rested on what it means when "some user space
> >> > fails"; apparently that "user space" is just a test suite (which
> >> > can/should be fixed).
> >> 
> >> Even if it wasn't a test suite it should still fail.  Either the fs
> >> supports O_DIRECT or it doesn't.  Right now, the only way an application
> >> can figure this out is to try an open and see if it fails.  Don't break
> >> that.
> >
> > Who cares how a filesystem implements O_DIRECT as long as it does
> > not corrupt data? ext3 fell back to buffered IO in many situations,
> > yet the only complaints about that were performance. IOWs, it's long been
> > true that if the user cares about O_DIRECT *performance* then they
> > have to be careful about their choice of filesystem.
> 
> > But if it's only 5 lines of code per filesystem to support O_DIRECT
> > *correctly* via buffered IO, then exactly why should userspace have
> > to jump through hoops to explicitly handle open(O_DIRECT) failure?
> 
> > Especially when you consider that all they can do is fall back to
> > buffered IO themselves....
> 
> I had written counterpoints for all of this, but I thought better of
> it.  Old versions of the kernel simply ignore O_DIRECT, so clearly
> there's precedent.
> 
> I do think we should at least document what file systems appear to be
> doing.  Here's a man page patch for open (generated with extra context
> for easier reading).  Let me know what you think.

We shouldn't be ignoring it, but instead call it similar to O_DSYNC plus
removing the pages from cache.

-chris
Jeff Moyer Aug. 25, 2015, 2:18 p.m. UTC | #2
Chris Mason <clm@fb.com> writes:

>> I do think we should at least document what file systems appear to be
>> doing.  Here's a man page patch for open (generated with extra context
>> for easier reading).  Let me know what you think.
>
> We shouldn't be ignoring it, but instead call it similar to O_DSYNC plus
> removing the pages from cache.

Ah, right.  I'll fix that up, thanks.

-Jeff
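
For readers following the thread, the semantics Chris describes ("similar to O_DSYNC plus removing the pages from cache") can be approximated from user space. The sketch below is only an illustration under stated assumptions: the helper name write_through_and_drop is hypothetical, the caller is assumed to have opened fd with O_DSYNC, and posix_fadvise(POSIX_FADV_DONTNEED) is advisory, so the kernel may keep the pages. It is not what any filesystem does internally.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch under discussion.
 * Assumes fd was opened with O_DSYNC, so pwrite() returns only after
 * the data has reached stable storage. Afterwards it asks the kernel
 * to drop the cached pages covering the range that was written.
 * Returns the number of bytes written, or -1 with errno set.
 */
static ssize_t write_through_and_drop(int fd, const void *buf,
                                      size_t len, off_t off)
{
    ssize_t n = pwrite(fd, buf, len, off);
    if (n < 0)
        return -1;

    /* Advisory only: a failure here does not affect data integrity. */
    (void)posix_fadvise(fd, off, (off_t)n, POSIX_FADV_DONTNEED);
    return n;
}
```

A caller would open the file with something like open(path, O_WRONLY | O_CREAT | O_DSYNC, 0600) and route its writes through the helper. Unlike true direct I/O, the data still passes through the page cache on its way to storage.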

Patch

diff --git a/man2/open.2 b/man2/open.2
index 06c0a29..acc438b 100644
--- a/man2/open.2
+++ b/man2/open.2
@@ -1471,17 +1471,18 @@  a flag of the same name, but without alignment restrictions.
 .LP
 .B O_DIRECT
 support was added under Linux in kernel version 2.4.10.
 Older Linux kernels simply ignore this flag.
 Some filesystems may not implement the flag and
 .BR open ()
 will fail with
 .B EINVAL
-if it is used.
+if it is used.  Other file systems may implement O_DIRECT via
+buffered I/O, which is essentially the same as ignoring the flag.
 .LP
 Applications should avoid mixing
 .B O_DIRECT
 and normal I/O to the same file,
 and especially to overlapping byte regions in the same file.
 Even when the filesystem correctly handles the coherency issues in
 this situation, overall I/O throughput is likely to be slower than
 using either mode alone.
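
The probing pattern debated in the thread (try open() with O_DIRECT, and fall back to buffered I/O if it fails with EINVAL) might look roughly like the sketch below. The helper name open_prefer_direct and its out-parameter are hypothetical, chosen for illustration. Note the caveat the proposed man page text raises: a successful O_DIRECT open does not prove that the filesystem bypasses the page cache, since some filesystems accept the flag and use buffered I/O anyway.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on Linux */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Tries to open path with O_DIRECT; if the filesystem rejects the flag
 * with EINVAL, retries the same open without it.
 * On success returns the descriptor and sets *direct to 1 if the
 * O_DIRECT open was accepted, or 0 if the buffered fallback was used.
 * On any other failure returns -1 with errno set by open().
 */
static int open_prefer_direct(const char *path, int flags, mode_t mode,
                              int *direct)
{
    int fd = open(path, flags | O_DIRECT, mode);
    if (fd >= 0) {
        *direct = 1;
        return fd;
    }
    if (errno != EINVAL)
        return -1;             /* unrelated error: report it as-is */

    *direct = 0;
    return open(path, flags, mode);
}
```

An application that cares about the difference can log or act on the value stored in *direct. One that does not can ignore it, which is in effect the behaviour Dave argues the filesystem might as well provide itself.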