
Python 3 bindings

Message ID 20170221130300.GF1146@mail-itl (mailing list archive)
State New, archived

Commit Message

Marek Marczykowski-Górecki Feb. 21, 2017, 1:03 p.m. UTC
On Mon, Feb 20, 2017 at 05:18:44PM +0000, Wei Liu wrote:
> On Fri, Feb 17, 2017 at 01:36:01PM +0100, Marek Marczykowski-Górecki wrote:
> > Hi,
> > 
> > I'm adjusting python bindings to work on python3 too. This will require
> > few #if in the code (to compile for both python2 and python3), but it
> > isn't that bad. But there are some major changes in python3, which
> > require some decision about the bindings API:
> > 
> > 1. Python3 has no longer separate 'int' and 'long' type - old 'long'
> > type was renamed to 'int' (but on C-API level, it uses PyLong_*). I see
> > two options:
> >   - switch to PyLong_* everywhere, including python2 bindings - this
> >     makes the code much cleaner, but it is an API change in python2
> >   - switch to PyLong_* only for python3 - this will introduce some
> >     #ifdefs, but python2 API will be unchanged
> 
> Could you be more specific? Like, provide a code snippet?

Here is a version that has only been compile-tested:
https://github.com/marmarek/xen/tree/python3

It uses PyLong_* only for python3. Here is how it looks in the code (I've
skipped the s/PyInt_/PyLongOrInt_/ changes for readability):

-----8<-----
-----8<-----

> 
> > 
> > 2. Python3 has no longer separate 'str' and 'unicode' type, new 'str' is
> > the same as 'unicode' (PyUnicode_* at C-API level). For things not
> > really unicode-aware, 'bytes' type should be used. On the other hand, in
> > python2 'bytes' type was the same as 'str'.
> > This affects various places, where in most cases 'bytes' type is
> > appropriate (for example cpuid). But I'm not sure about xenstore paths -
> > those should also be 'bytes', or maybe 'unicode' (which is implicitly
> > using 'utf-8' encoding)? I think the only reason to use 'unicode' is
> 
> According to docs/txt/misc/xenstore.txt, paths should be ASCII
> alphanumerics plus four punctuation characters. Not sure if this is
> relevant to what you describe.

It's easy to make the functions accept both 'bytes' and 'unicode'. The
question is what the return type should be (read_watch, ls etc). Given
the limited character set used there, I'm in favor of 'unicode': it is
easier to handle, and we shouldn't hit any unicode decoding problems.
Maybe the same should apply to path arguments (use 'unicode')? Most
file-handling methods in python3 use 'unicode' for paths, if that
matters.
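
As a rough illustration of the "accept both, return unicode" idea at the
Python level (this is only a sketch for Python 3, not code from the
branch; the helper name to_path_text is hypothetical, and the ASCII-only
assumption comes from the xenstore.txt remark quoted above):

```python
def to_path_text(path):
    """Hypothetical helper: accept bytes or str, always return str.

    Assumes paths are ASCII-only, so decoding cannot silently
    produce unexpected characters; anything else raises an error.
    """
    if isinstance(path, bytes):
        # Raises UnicodeDecodeError if the bytes are not plain ASCII.
        return path.decode("ascii")
    if isinstance(path, str):
        # Raises UnicodeEncodeError if the text is not plain ASCII.
        path.encode("ascii")
        return path
    raise TypeError("path must be bytes or str, not %s" % type(path).__name__)


# Both spellings of the same path normalise to the same str value.
print(to_path_text(b"/local/domain/0/name"))
print(to_path_text("/local/domain/0/name"))
```

Strict ASCII decoding is one possible choice here; it turns an
unexpected byte into an immediate error instead of a mangled path.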

> > convenience for API users - in python3 if you write 'some string' it
> > will be unicode type, to create bytes data you need to write b'some
> > string'.
> > As for python2, it should definitely be still 'str'/'bytes' type.
> > 
> > There is one more little detail - build process. Here I'm going to
> > follow popular standard - use $(PYTHON) variable - if that points to
> > python3, build for python3. Actually this means no change in the current
> > makefile. If someone want to build for both python2 and python3, will
> > need to call the build twice - at packaging level.

Comments

Wei Liu Feb. 22, 2017, 11:34 a.m. UTC | #1
On Tue, Feb 21, 2017 at 02:03:00PM +0100, Marek Marczykowski-Górecki wrote:
> On Mon, Feb 20, 2017 at 05:18:44PM +0000, Wei Liu wrote:
> > On Fri, Feb 17, 2017 at 01:36:01PM +0100, Marek Marczykowski-Górecki wrote:
> > > Hi,
> > > 
> > > I'm adjusting python bindings to work on python3 too. This will require
> > > few #if in the code (to compile for both python2 and python3), but it
> > > isn't that bad. But there are some major changes in python3, which
> > > require some decision about the bindings API:
> > > 
> > > 1. Python3 has no longer separate 'int' and 'long' type - old 'long'
> > > type was renamed to 'int' (but on C-API level, it uses PyLong_*). I see
> > > two options:
> > >   - switch to PyLong_* everywhere, including python2 bindings - this
> > >     makes the code much cleaner, but it is an API change in python2
> > >   - switch to PyLong_* only for python3 - this will introduce some
> > >     #ifdefs, but python2 API will be unchanged
> > 
> > Could you be more specific? Like, provide a code snippet?
> 
> Here is compile tested only version:
> https://github.com/marmarek/xen/tree/python3
> 
> It uses PyLong_* only for python3, here is how it looks in code (I've
> skipped s/PyInt_/PyLongOrInt_/ for readability):
> 
> -----8<-----
> --- a/tools/python/xen/lowlevel/xc/xc.c
> +++ b/tools/python/xen/lowlevel/xc/xc.c
> @@ -34,6 +34,17 @@
>  
>  #define FLASK_CTX_LEN 1024
>  
> +/* Python 2 compatibility */
> +#if PY_VERSION_HEX >= 0x03000000
> +#define PyLongOrInt_FromLong PyLong_FromLong
> +#define PyLongOrInt_Check PyLong_Check
> +#define PyLongOrInt_AsLong PyLong_AsLong
> +#else
> +#define PyLongOrInt_FromLong PyInt_FromLong
> +#define PyLongOrInt_Check PyInt_Check
> +#define PyLongOrInt_AsLong PyInt_AsLong
> +#endif
> +
>  static PyObject *xc_error_obj, *zero;
>  
>  typedef struct {
> --- a/tools/python/xen/lowlevel/xs/xs.c
> +++ b/tools/python/xen/lowlevel/xs/xs.c
> @@ -43,6 +43,14 @@
>  #define PKG "xen.lowlevel.xs"
>  #define CLS "xs"
>  
> +#if PY_VERSION_HEX < 0x03000000
> +/* Python 2 compatibility */
> +#define PyLong_FromLong PyInt_FromLong
> +#undef PyLong_Check
> +#define PyLong_Check PyInt_Check
> +#define PyLong_AsLong PyInt_AsLong
> +#endif
> +
>  static PyObject *xs_error;
>  

If this is the recommended practice, then that's fine. I can't seem to
find such practice in https://docs.python.org/3/howto/cporting.html but
I'm no python binding expert.

BTW, I went through your python3 branch. It seems that some patches can
be submitted independently.


>  /** Python wrapper round an xs handle.
> -----8<-----
> 
> > 
> > > 
> > > 2. Python3 has no longer separate 'str' and 'unicode' type, new 'str' is
> > > the same as 'unicode' (PyUnicode_* at C-API level). For things not
> > > really unicode-aware, 'bytes' type should be used. On the other hand, in
> > > python2 'bytes' type was the same as 'str'.
> > > This affects various places, where in most cases 'bytes' type is
> > > appropriate (for example cpuid). But I'm not sure about xenstore paths -
> > > those should also be 'bytes', or maybe 'unicode' (which is implicitly
> > > using 'utf-8' encoding)? I think the only reason to use 'unicode' is
> > 
> > According to docs/txt/misc/xenstore.txt, paths should be ASCII
> > alphanumerics plus four punctuation characters. Not sure if this is
> > relevant to what you describe.
> 
> It's easy to make function accept both 'bytes' and 'unicode'. The
> question is what should be return type (read_watch, ls etc) - given
> limited character set used there, I'm in favor of 'unicode' - easier to
> handle, but we shouldn't hit any unicode decoding problems.
> Maybe the same should apply to path arguments (use 'unicode')? Most
> file-handling methods in python3 use 'unicode' for paths, if that
> matters.
> 

OK. Using unicode makes sense to me. Again, I'm no python expert and I
trust what you said. :-)

> > > convenience for API users - in python3 if you write 'some string' it
> > > will be unicode type, to create bytes data you need to write b'some
> > > string'.
> > > As for python2, it should definitely be still 'str'/'bytes' type.
> > > 
> > > There is one more little detail - build process. Here I'm going to
> > > follow popular standard - use $(PYTHON) variable - if that points to
> > > python3, build for python3. Actually this means no change in the current
> > > makefile. If someone want to build for both python2 and python3, will
> > > need to call the build twice - at packaging level.
> 
> -- 
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
Marek Marczykowski-Górecki Feb. 22, 2017, 11:52 a.m. UTC | #2
On Wed, Feb 22, 2017 at 11:34:16AM +0000, Wei Liu wrote:
> On Tue, Feb 21, 2017 at 02:03:00PM +0100, Marek Marczykowski-Górecki wrote:
> > On Mon, Feb 20, 2017 at 05:18:44PM +0000, Wei Liu wrote:
> > > On Fri, Feb 17, 2017 at 01:36:01PM +0100, Marek Marczykowski-Górecki wrote:
> > > > Hi,
> > > > 
> > > > I'm adjusting python bindings to work on python3 too. This will require
> > > > few #if in the code (to compile for both python2 and python3), but it
> > > > isn't that bad. But there are some major changes in python3, which
> > > > require some decision about the bindings API:
> > > > 
> > > > 1. Python3 has no longer separate 'int' and 'long' type - old 'long'
> > > > type was renamed to 'int' (but on C-API level, it uses PyLong_*). I see
> > > > two options:
> > > >   - switch to PyLong_* everywhere, including python2 bindings - this
> > > >     makes the code much cleaner, but it is an API change in python2
> > > >   - switch to PyLong_* only for python3 - this will introduce some
> > > >     #ifdefs, but python2 API will be unchanged
> > > 
> > > Could you be more specific? Like, provide a code snippet?
> > 
> > Here is compile tested only version:
> > https://github.com/marmarek/xen/tree/python3
> > 
> > It uses PyLong_* only for python3, here is how it looks in code (I've
> > skipped s/PyInt_/PyLongOrInt_/ for readability):
> > 
> > -----8<-----
> > --- a/tools/python/xen/lowlevel/xc/xc.c
> > +++ b/tools/python/xen/lowlevel/xc/xc.c
> > @@ -34,6 +34,17 @@
> >  
> >  #define FLASK_CTX_LEN 1024
> >  
> > +/* Python 2 compatibility */
> > +#if PY_VERSION_HEX >= 0x03000000
> > +#define PyLongOrInt_FromLong PyLong_FromLong
> > +#define PyLongOrInt_Check PyLong_Check
> > +#define PyLongOrInt_AsLong PyLong_AsLong
> > +#else
> > +#define PyLongOrInt_FromLong PyInt_FromLong
> > +#define PyLongOrInt_Check PyInt_Check
> > +#define PyLongOrInt_AsLong PyInt_AsLong
> > +#endif
> > +
> >  static PyObject *xc_error_obj, *zero;
> >  
> >  typedef struct {
> > --- a/tools/python/xen/lowlevel/xs/xs.c
> > +++ b/tools/python/xen/lowlevel/xs/xs.c
> > @@ -43,6 +43,14 @@
> >  #define PKG "xen.lowlevel.xs"
> >  #define CLS "xs"
> >  
> > +#if PY_VERSION_HEX < 0x03000000
> > +/* Python 2 compatibility */
> > +#define PyLong_FromLong PyInt_FromLong
> > +#undef PyLong_Check
> > +#define PyLong_Check PyInt_Check
> > +#define PyLong_AsLong PyInt_AsLong
> > +#endif
> > +
> >  static PyObject *xs_error;
> >  
> 
> If this is the recommended practice, then that's fine. I can't seem to
> find such practice in https://docs.python.org/3/howto/cporting.html but
> I'm no python binding expert.

"In the C-API, PyInt_* functions are replaced by their PyLong_*
equivalents."
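
For comparison, the same version switch can be sketched on the Python
side. sys.hexversion uses the same encoding as PY_VERSION_HEX, so the
test mirrors the #if in the patch. The names IS_PY3, integer_types and
is_integer are made up for this illustration and are not part of the
bindings:

```python
import sys

# Same comparison as "#if PY_VERSION_HEX >= 0x03000000" in the C code.
IS_PY3 = sys.hexversion >= 0x03000000

if IS_PY3:
    # Python 3: the old 'long' is now simply 'int'.
    integer_types = (int,)
else:
    # Python 2: 'int' and 'long' are separate types.
    integer_types = (int, long)  # noqa: F821 (only evaluated on Python 2)


def is_integer(value):
    """Illustrative check accepting whatever counts as an integer here."""
    return isinstance(value, integer_types)


# A value that would have been a 'long' on Python 2 is a plain int on 3.
print(is_integer(2 ** 64))
```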

> BTW, I went through your python3 branch. It seems that some patches can
> be submitted independently.

Yes, I've tried to separate cleanup commits from actual python3 support.

> >  /** Python wrapper round an xs handle.
> > -----8<-----
> > 
> > > 
> > > > 
> > > > 2. Python3 has no longer separate 'str' and 'unicode' type, new 'str' is
> > > > the same as 'unicode' (PyUnicode_* at C-API level). For things not
> > > > really unicode-aware, 'bytes' type should be used. On the other hand, in
> > > > python2 'bytes' type was the same as 'str'.
> > > > This affects various places, where in most cases 'bytes' type is
> > > > appropriate (for example cpuid). But I'm not sure about xenstore paths -
> > > > those should also be 'bytes', or maybe 'unicode' (which is implicitly
> > > > using 'utf-8' encoding)? I think the only reason to use 'unicode' is
> > > 
> > > According to docs/txt/misc/xenstore.txt, paths should be ASCII
> > > alphanumerics plus four punctuation characters. Not sure if this is
> > > relevant to what you describe.
> > 
> > It's easy to make function accept both 'bytes' and 'unicode'. The
> > question is what should be return type (read_watch, ls etc) - given
> > limited character set used there, I'm in favor of 'unicode' - easier to
> > handle, but we shouldn't hit any unicode decoding problems.
> > Maybe the same should apply to path arguments (use 'unicode')? Most
> > file-handling methods in python3 use 'unicode' for paths, if that
> > matters.
> > 
> 
> OK. Using unicode makes sense to me. Again, I'm no python expert and I
> trust what you said. :-)

Ok, I'll adjust patches to return unicode paths.

Then test them and actually send them here :)

Patch

--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -34,6 +34,17 @@ 
 
 #define FLASK_CTX_LEN 1024
 
+/* Python 2 compatibility */
+#if PY_VERSION_HEX >= 0x03000000
+#define PyLongOrInt_FromLong PyLong_FromLong
+#define PyLongOrInt_Check PyLong_Check
+#define PyLongOrInt_AsLong PyLong_AsLong
+#else
+#define PyLongOrInt_FromLong PyInt_FromLong
+#define PyLongOrInt_Check PyInt_Check
+#define PyLongOrInt_AsLong PyInt_AsLong
+#endif
+
 static PyObject *xc_error_obj, *zero;
 
 typedef struct {
--- a/tools/python/xen/lowlevel/xs/xs.c
+++ b/tools/python/xen/lowlevel/xs/xs.c
@@ -43,6 +43,14 @@ 
 #define PKG "xen.lowlevel.xs"
 #define CLS "xs"
 
+#if PY_VERSION_HEX < 0x03000000
+/* Python 2 compatibility */
+#define PyLong_FromLong PyInt_FromLong
+#undef PyLong_Check
+#define PyLong_Check PyInt_Check
+#define PyLong_AsLong PyInt_AsLong
+#endif
+
 static PyObject *xs_error;
 
 /** Python wrapper round an xs handle.