diff mbox series

[v2,28/38] qapi/gen.py: update write() to be more idiomatic

Message ID 20200922210101.4081073-29-jsnow@redhat.com (mailing list archive)
State New, archived
Headers show
Series qapi: static typing conversion, pt1 | expand

Commit Message

John Snow Sept. 22, 2020, 9 p.m. UTC
Make the file handling here just a tiny bit more idiomatic.
(I realize this is heavily subjective.)

Use exist_ok=True for os.makedirs and remove the exception,
use fdopen() to wrap the file descriptor in a File-like object,
and use a context manager for managing the file pointer.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/gen.py | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

Comments

Eduardo Habkost Sept. 23, 2020, 3:26 p.m. UTC | #1
On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
> Make the file handling here just a tiny bit more idiomatic.
> (I realize this is heavily subjective.)
> 
> Use exist_ok=True for os.makedirs and remove the exception,
> use fdopen() to wrap the file descriptor in a File-like object,
> and use a context manager for managing the file pointer.
> 
> Signed-off-by: John Snow <jsnow@redhat.com>

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

I really miss a comment below explaining why we use
open(os.open(pathname, ...), ...) instead of open(pathname, ...).

> ---
>  scripts/qapi/gen.py | 24 ++++++++++--------------
>  1 file changed, 10 insertions(+), 14 deletions(-)
> 
> diff --git a/scripts/qapi/gen.py b/scripts/qapi/gen.py
> index ba32f776e6..cf340e66d4 100644
> --- a/scripts/qapi/gen.py
> +++ b/scripts/qapi/gen.py
> @@ -14,7 +14,6 @@
>  # See the COPYING file in the top-level directory.
>  
>  from contextlib import contextmanager
> -import errno
>  import os
>  import re
>  from typing import Dict, Generator, List, Optional, Tuple
> @@ -64,21 +63,18 @@ def write(self, output_dir: str) -> None:
>              return
>          pathname = os.path.join(output_dir, self.fname)
>          odir = os.path.dirname(pathname)
> +
>          if odir:
> -            try:
> -                os.makedirs(odir)
> -            except os.error as e:
> -                if e.errno != errno.EEXIST:
> -                    raise
> +            os.makedirs(odir, exist_ok=True)
> +
>          fd = os.open(pathname, os.O_RDWR | os.O_CREAT, 0o666)
> -        f = open(fd, 'r+', encoding='utf-8')
> -        text = self.get_content()
> -        oldtext = f.read(len(text) + 1)
> -        if text != oldtext:
> -            f.seek(0)
> -            f.truncate(0)
> -            f.write(text)
> -        f.close()
> +        with os.fdopen(fd, 'r+', encoding='utf-8') as fp:
> +            text = self.get_content()
> +            oldtext = fp.read(len(text) + 1)
> +            if text != oldtext:
> +                fp.seek(0)
> +                fp.truncate(0)
> +                fp.write(text)
>  
>  
>  def _wrap_ifcond(ifcond: List[str], before: str, after: str) -> str:
> -- 
> 2.26.2
>
John Snow Sept. 23, 2020, 6:37 p.m. UTC | #2
On 9/23/20 11:26 AM, Eduardo Habkost wrote:
> On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
>> Make the file handling here just a tiny bit more idiomatic.
>> (I realize this is heavily subjective.)
>>
>> Use exist_ok=True for os.makedirs and remove the exception,
>> use fdopen() to wrap the file descriptor in a File-like object,
>> and use a context manager for managing the file pointer.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
> 
> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> 
> I really miss a comment below explaining why we use
> open(os.open(pathname, ...), ...) instead of open(pathname, ...).
> 

Not known to me. It was introduced in 907b846653 as part of an effort to 
reduce rebuild times. Maybe this avoids a modification time change if 
the file already exists?

Markus?
Cleber Rosa Sept. 24, 2020, 3:59 p.m. UTC | #3
On Wed, Sep 23, 2020 at 02:37:27PM -0400, John Snow wrote:
> On 9/23/20 11:26 AM, Eduardo Habkost wrote:
> > On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
> > > Make the file handling here just a tiny bit more idiomatic.
> > > (I realize this is heavily subjective.)
> > > 
> > > Use exist_ok=True for os.makedirs and remove the exception,
> > > use fdopen() to wrap the file descriptor in a File-like object,
> > > and use a context manager for managing the file pointer.
> > > 
> > > Signed-off-by: John Snow <jsnow@redhat.com>
> > 
> > Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> > 
> > I really miss a comment below explaining why we use
> > open(os.open(pathname, ...), ...) instead of open(pathname, ...).
> > 
> 
> Not known to me. It was introduced in 907b846653 as part of an effort to
> reduce rebuild times. Maybe this avoids a modification time change if the
> file already exists?
> 
> Markus?

AFACIT the change on 907b846653 is effective because of the "is new
text different from old text?" conditional.  I can not see how the
separate/duplicate open/fdopen would contribute to that.

But, let's hear from Markus.

- Cleber.
Cleber Rosa Sept. 24, 2020, 4:01 p.m. UTC | #4
On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
> Make the file handling here just a tiny bit more idiomatic.
> (I realize this is heavily subjective.)
> 
> Use exist_ok=True for os.makedirs and remove the exception,
> use fdopen() to wrap the file descriptor in a File-like object,
> and use a context manager for managing the file pointer.
> 
> Signed-off-by: John Snow <jsnow@redhat.com>

Reviewed-by: Cleber Rosa <crosa@redhat.com>
Markus Armbruster Sept. 25, 2020, 1:15 p.m. UTC | #5
Cleber Rosa <crosa@redhat.com> writes:

> On Wed, Sep 23, 2020 at 02:37:27PM -0400, John Snow wrote:
>> On 9/23/20 11:26 AM, Eduardo Habkost wrote:
>> > On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
>> > > Make the file handling here just a tiny bit more idiomatic.
>> > > (I realize this is heavily subjective.)
>> > > 
>> > > Use exist_ok=True for os.makedirs and remove the exception,
>> > > use fdopen() to wrap the file descriptor in a File-like object,
>> > > and use a context manager for managing the file pointer.
>> > > 
>> > > Signed-off-by: John Snow <jsnow@redhat.com>
>> > 
>> > Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>> > 
>> > I really miss a comment below explaining why we use
>> > open(os.open(pathname, ...), ...) instead of open(pathname, ...).

This code:

        fd = os.open(pathname, os.O_RDWR | os.O_CREAT, 0o666)
        f = open(fd, 'r+', encoding='utf-8')

>> Not known to me. It was introduced in 907b846653 as part of an effort to
>> reduce rebuild times. Maybe this avoids a modification time change if the
>> file already exists?
>> 
>> Markus?
>
> AFACIT the change on 907b846653 is effective because of the "is new
> text different from old text?" conditional.  I can not see how the
> separate/duplicate open/fdopen would contribute to that.
>
> But, let's hear from Markus.

This was my best attempt to open the file read/write, creating it if it
doesn't exist.

Plain

        f = open(pathname, "r+", encoding='utf-8')

fails instead of creates, and

        f = open(pathname, "w+", encoding='utf-8')

truncates.

If you know a better way, tell me!
Daniel P. Berrangé Sept. 25, 2020, 1:24 p.m. UTC | #6
On Fri, Sep 25, 2020 at 03:15:57PM +0200, Markus Armbruster wrote:
> Cleber Rosa <crosa@redhat.com> writes:
> 
> > On Wed, Sep 23, 2020 at 02:37:27PM -0400, John Snow wrote:
> >> On 9/23/20 11:26 AM, Eduardo Habkost wrote:
> >> > On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
> >> > > Make the file handling here just a tiny bit more idiomatic.
> >> > > (I realize this is heavily subjective.)
> >> > > 
> >> > > Use exist_ok=True for os.makedirs and remove the exception,
> >> > > use fdopen() to wrap the file descriptor in a File-like object,
> >> > > and use a context manager for managing the file pointer.
> >> > > 
> >> > > Signed-off-by: John Snow <jsnow@redhat.com>
> >> > 
> >> > Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> >> > 
> >> > I really miss a comment below explaining why we use
> >> > open(os.open(pathname, ...), ...) instead of open(pathname, ...).
> 
> This code:
> 
>         fd = os.open(pathname, os.O_RDWR | os.O_CREAT, 0o666)
>         f = open(fd, 'r+', encoding='utf-8')
> 
> >> Not known to me. It was introduced in 907b846653 as part of an effort to
> >> reduce rebuild times. Maybe this avoids a modification time change if the
> >> file already exists?
> >> 
> >> Markus?
> >
> > AFACIT the change on 907b846653 is effective because of the "is new
> > text different from old text?" conditional.  I can not see how the
> > separate/duplicate open/fdopen would contribute to that.
> >
> > But, let's hear from Markus.
> 
> This was my best attempt to open the file read/write, creating it if it
> doesn't exist.
> 
> Plain
> 
>         f = open(pathname, "r+", encoding='utf-8')
> 
> fails instead of creates, and
> 
>         f = open(pathname, "w+", encoding='utf-8')
> 
> truncates.
> 
> If you know a better way, tell me!

IIUC, you need  "a+" as the mode, rather than "w+"

Regards,
Daniel
Eduardo Habkost Sept. 25, 2020, 1:26 p.m. UTC | #7
On Fri, Sep 25, 2020 at 03:15:57PM +0200, Markus Armbruster wrote:
> Cleber Rosa <crosa@redhat.com> writes:
> 
> > On Wed, Sep 23, 2020 at 02:37:27PM -0400, John Snow wrote:
> >> On 9/23/20 11:26 AM, Eduardo Habkost wrote:
> >> > On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
> >> > > Make the file handling here just a tiny bit more idiomatic.
> >> > > (I realize this is heavily subjective.)
> >> > > 
> >> > > Use exist_ok=True for os.makedirs and remove the exception,
> >> > > use fdopen() to wrap the file descriptor in a File-like object,
> >> > > and use a context manager for managing the file pointer.
> >> > > 
> >> > > Signed-off-by: John Snow <jsnow@redhat.com>
> >> > 
> >> > Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> >> > 
> >> > I really miss a comment below explaining why we use
> >> > open(os.open(pathname, ...), ...) instead of open(pathname, ...).
> 
> This code:
> 
>         fd = os.open(pathname, os.O_RDWR | os.O_CREAT, 0o666)
>         f = open(fd, 'r+', encoding='utf-8')
> 
> >> Not known to me. It was introduced in 907b846653 as part of an effort to
> >> reduce rebuild times. Maybe this avoids a modification time change if the
> >> file already exists?
> >> 
> >> Markus?
> >
> > AFACIT the change on 907b846653 is effective because of the "is new
> > text different from old text?" conditional.  I can not see how the
> > separate/duplicate open/fdopen would contribute to that.
> >
> > But, let's hear from Markus.
> 
> This was my best attempt to open the file read/write, creating it if it
> doesn't exist.
> 
> Plain
> 
>         f = open(pathname, "r+", encoding='utf-8')
> 
> fails instead of creates, and
> 
>         f = open(pathname, "w+", encoding='utf-8')
> 
> truncates.
> 
> If you know a better way, tell me!

Thanks for the explanation!

Yeah, it looks like there's no combination of open() flags that
would get translated to O_RDWR|O_CREAT.

Using os.open() like you did seems more straightforward than
catching FileNotFoundError.
Eric Blake Sept. 25, 2020, 1:34 p.m. UTC | #8
On 9/25/20 8:24 AM, Daniel P. Berrangé wrote:

>> This code:
>>
>>          fd = os.open(pathname, os.O_RDWR | os.O_CREAT, 0o666)
>>          f = open(fd, 'r+', encoding='utf-8')
>>

>> This was my best attempt to open the file read/write, creating it if it
>> doesn't exist.
>>
>> Plain
>>
>>          f = open(pathname, "r+", encoding='utf-8')
>>
>> fails instead of creates, and

Checking what POSIX says for fopen():
https://pubs.opengroup.org/onlinepubs/9699919799/functions/fopen.html

Yep, "r+" does not use O_CREAT.

>>
>>          f = open(pathname, "w+", encoding='utf-8')
>>
>> truncates.

Yep, "w+" uses O_TRUNC.

>>
>> If you know a better way, tell me!
> 
> IIUC, you need  "a+" as the mode, rather than "w+"

That uses O_APPEND.  Which is fine if you are only appending and not 
touching existing contents, but not what you want if you are doing 
random access.

Yeah, the fopen() interface is rather puny, in that it does not express 
as many modes as open() supports.
Markus Armbruster Sept. 25, 2020, 1:52 p.m. UTC | #9
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Sep 25, 2020 at 03:15:57PM +0200, Markus Armbruster wrote:
>> Cleber Rosa <crosa@redhat.com> writes:
>> 
>> > On Wed, Sep 23, 2020 at 02:37:27PM -0400, John Snow wrote:
>> >> On 9/23/20 11:26 AM, Eduardo Habkost wrote:
>> >> > On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
>> >> > > Make the file handling here just a tiny bit more idiomatic.
>> >> > > (I realize this is heavily subjective.)
>> >> > > 
>> >> > > Use exist_ok=True for os.makedirs and remove the exception,
>> >> > > use fdopen() to wrap the file descriptor in a File-like object,
>> >> > > and use a context manager for managing the file pointer.
>> >> > > 
>> >> > > Signed-off-by: John Snow <jsnow@redhat.com>
>> >> > 
>> >> > Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>> >> > 
>> >> > I really miss a comment below explaining why we use
>> >> > open(os.open(pathname, ...), ...) instead of open(pathname, ...).
>> 
>> This code:
>> 
>>         fd = os.open(pathname, os.O_RDWR | os.O_CREAT, 0o666)
>>         f = open(fd, 'r+', encoding='utf-8')
>> 
>> >> Not known to me. It was introduced in 907b846653 as part of an effort to
>> >> reduce rebuild times. Maybe this avoids a modification time change if the
>> >> file already exists?
>> >> 
>> >> Markus?
>> >
>> > AFACIT the change on 907b846653 is effective because of the "is new
>> > text different from old text?" conditional.  I can not see how the
>> > separate/duplicate open/fdopen would contribute to that.
>> >
>> > But, let's hear from Markus.
>> 
>> This was my best attempt to open the file read/write, creating it if it
>> doesn't exist.
>> 
>> Plain
>> 
>>         f = open(pathname, "r+", encoding='utf-8')
>> 
>> fails instead of creates, and
>> 
>>         f = open(pathname, "w+", encoding='utf-8')
>> 
>> truncates.
>> 
>> If you know a better way, tell me!
>
> IIUC, you need  "a+" as the mode, rather than "w+"

Sure this lets me do

            f.seek(0)
            f.truncate(0)
            f.write(text)

to overwrite the old contents on all systems?

Documentation cautions:

    [...] 'a' for appending (which on some Unix systems, means that all
    writes append to the end of the file regardless of the current seek
    position).
Eric Blake Sept. 25, 2020, 3:47 p.m. UTC | #10
On 9/25/20 8:52 AM, Markus Armbruster wrote:

>>> This was my best attempt to open the file read/write, creating it if it
>>> doesn't exist.
>>>
>>> Plain
>>>
>>>          f = open(pathname, "r+", encoding='utf-8')
>>>
>>> fails instead of creates, and
>>>
>>>          f = open(pathname, "w+", encoding='utf-8')
>>>
>>> truncates.
>>>
>>> If you know a better way, tell me!
>>
>> IIUC, you need  "a+" as the mode, rather than "w+"
> 
> Sure this lets me do
> 
>              f.seek(0)
>              f.truncate(0)
>              f.write(text)
> 
> to overwrite the old contents on all systems?

As long as you do a single pass over the output (you issue a stream of 
f.write() after the truncate, but never a seek), then this will work.

> 
> Documentation cautions:
> 
>      [...] 'a' for appending (which on some Unix systems, means that all
>      writes append to the end of the file regardless of the current seek
>      position).

Yes, that means that random access is impossible on such a stream.  But 
not all file creation patterns require random access.
John Snow Sept. 25, 2020, 4:33 p.m. UTC | #11
On 9/25/20 9:26 AM, Eduardo Habkost wrote:
> On Fri, Sep 25, 2020 at 03:15:57PM +0200, Markus Armbruster wrote:
>> Cleber Rosa <crosa@redhat.com> writes:
>>
>>> On Wed, Sep 23, 2020 at 02:37:27PM -0400, John Snow wrote:
>>>> On 9/23/20 11:26 AM, Eduardo Habkost wrote:
>>>>> On Tue, Sep 22, 2020 at 05:00:51PM -0400, John Snow wrote:
>>>>>> Make the file handling here just a tiny bit more idiomatic.
>>>>>> (I realize this is heavily subjective.)
>>>>>>
>>>>>> Use exist_ok=True for os.makedirs and remove the exception,
>>>>>> use fdopen() to wrap the file descriptor in a File-like object,
>>>>>> and use a context manager for managing the file pointer.
>>>>>>
>>>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>>>>
>>>>> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>>>>>
>>>>> I really miss a comment below explaining why we use
>>>>> open(os.open(pathname, ...), ...) instead of open(pathname, ...).
>>
>> This code:
>>
>>          fd = os.open(pathname, os.O_RDWR | os.O_CREAT, 0o666)
>>          f = open(fd, 'r+', encoding='utf-8')
>>
>>>> Not known to me. It was introduced in 907b846653 as part of an effort to
>>>> reduce rebuild times. Maybe this avoids a modification time change if the
>>>> file already exists?
>>>>
>>>> Markus?
>>>
>>> AFACIT the change on 907b846653 is effective because of the "is new
>>> text different from old text?" conditional.  I can not see how the
>>> separate/duplicate open/fdopen would contribute to that.
>>>
>>> But, let's hear from Markus.
>>
>> This was my best attempt to open the file read/write, creating it if it
>> doesn't exist.
>>
>> Plain
>>
>>          f = open(pathname, "r+", encoding='utf-8')
>>
>> fails instead of creates, and
>>
>>          f = open(pathname, "w+", encoding='utf-8')
>>
>> truncates.
>>
>> If you know a better way, tell me!
> 
> Thanks for the explanation!
> 
> Yeah, it looks like there's no combination of open() flags that
> would get translated to O_RDWR|O_CREAT.
> 
> Using os.open() like you did seems more straightforward than
> catching FileNotFoundError.
> 

OK. Using fdopen works for us, so let's stick with it. I will add a 
comment explaining our reasoning.

Thanks!

--js
Markus Armbruster Sept. 28, 2020, 12:09 p.m. UTC | #12
Eric Blake <eblake@redhat.com> writes:

> On 9/25/20 8:52 AM, Markus Armbruster wrote:
>
>>>> This was my best attempt to open the file read/write, creating it if it
>>>> doesn't exist.
>>>>
>>>> Plain
>>>>
>>>>          f = open(pathname, "r+", encoding='utf-8')
>>>>
>>>> fails instead of creates, and
>>>>
>>>>          f = open(pathname, "w+", encoding='utf-8')
>>>>
>>>> truncates.
>>>>
>>>> If you know a better way, tell me!
>>>
>>> IIUC, you need  "a+" as the mode, rather than "w+"
>> Sure this lets me do
>>              f.seek(0)
>>              f.truncate(0)
>>              f.write(text)
>> to overwrite the old contents on all systems?
>
> As long as you do a single pass over the output (you issue a stream of
> f.write() after the truncate, but never a seek), then this will work.

Well, I do seek(), right before the truncate.

>> Documentation cautions:
>>      [...] 'a' for appending (which on some Unix systems, means that
>> all
>>      writes append to the end of the file regardless of the current seek
>>      position).
>
> Yes, that means that random access is impossible on such a stream.
> But not all file creation patterns require random access.

To be honest, I still prefer the code I wrote, because there the reader
only wonders why I didn't just open(), while here we get to argue about
subtleties of mode "a+".
John Snow Sept. 28, 2020, 2:08 p.m. UTC | #13
On 9/28/20 8:09 AM, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
>> On 9/25/20 8:52 AM, Markus Armbruster wrote:
>>
>>>>> This was my best attempt to open the file read/write, creating it if it
>>>>> doesn't exist.
>>>>>
>>>>> Plain
>>>>>
>>>>>           f = open(pathname, "r+", encoding='utf-8')
>>>>>
>>>>> fails instead of creates, and
>>>>>
>>>>>           f = open(pathname, "w+", encoding='utf-8')
>>>>>
>>>>> truncates.
>>>>>
>>>>> If you know a better way, tell me!
>>>>
>>>> IIUC, you need  "a+" as the mode, rather than "w+"
>>> Sure this lets me do
>>>               f.seek(0)
>>>               f.truncate(0)
>>>               f.write(text)
>>> to overwrite the old contents on all systems?
>>
>> As long as you do a single pass over the output (you issue a stream of
>> f.write() after the truncate, but never a seek), then this will work.
> 
> Well, I do seek(), right before the truncate.
> 
>>> Documentation cautions:
>>>       [...] 'a' for appending (which on some Unix systems, means that
>>> all
>>>       writes append to the end of the file regardless of the current seek
>>>       position).
>>
>> Yes, that means that random access is impossible on such a stream.
>> But not all file creation patterns require random access.
> 
> To be honest, I still prefer the code I wrote, because there the reader
> only wonders why I didn't just open(), while here we get to argue about
> subtleties of mode "a+".
> 

I kept your os.open, I agree with you here.

(I still rewrote to use the context managers, though.)
diff mbox series

Patch

diff --git a/scripts/qapi/gen.py b/scripts/qapi/gen.py
index ba32f776e6..cf340e66d4 100644
--- a/scripts/qapi/gen.py
+++ b/scripts/qapi/gen.py
@@ -14,7 +14,6 @@ 
 # See the COPYING file in the top-level directory.
 
 from contextlib import contextmanager
-import errno
 import os
 import re
 from typing import Dict, Generator, List, Optional, Tuple
@@ -64,21 +63,18 @@  def write(self, output_dir: str) -> None:
             return
         pathname = os.path.join(output_dir, self.fname)
         odir = os.path.dirname(pathname)
+
         if odir:
-            try:
-                os.makedirs(odir)
-            except os.error as e:
-                if e.errno != errno.EEXIST:
-                    raise
+            os.makedirs(odir, exist_ok=True)
+
         fd = os.open(pathname, os.O_RDWR | os.O_CREAT, 0o666)
-        f = open(fd, 'r+', encoding='utf-8')
-        text = self.get_content()
-        oldtext = f.read(len(text) + 1)
-        if text != oldtext:
-            f.seek(0)
-            f.truncate(0)
-            f.write(text)
-        f.close()
+        with os.fdopen(fd, 'r+', encoding='utf-8') as fp:
+            text = self.get_content()
+            oldtext = fp.read(len(text) + 1)
+            if text != oldtext:
+                fp.seek(0)
+                fp.truncate(0)
+                fp.write(text)
 
 
 def _wrap_ifcond(ifcond: List[str], before: str, after: str) -> str: