[Bug,100105] Make Theano OpenCL support work on Clover and RadeonSI

Message ID bug-100105-502-U9wQqF0ebd@http.bugs.freedesktop.org/
State New

Commit Message

bugzilla-daemon@freedesktop.org April 4, 2018, 11:37 p.m. UTC
https://bugs.freedesktop.org/show_bug.cgi?id=100105

--- Comment #2 from Jan Vesely <jan.vesely@rutgers.edu> ---
Latest update:

>>> pygpu.test()
pygpu is installed in
/home/jvesely/.local/lib/python3.6/site-packages/pygpu-0.7.5+12.g6f0132c.dirty-py3.6-linux-x86_64.egg/pygpu
NumPy version 1.13.3
NumPy relaxed strides checking option: True
NumPy is installed in /usr/lib64/python3.6/site-packages/numpy
Python version 3.6.4 (default, Mar 13 2018, 18:18:20) [GCC 7.3.1 20180303 (Red
Hat 7.3.1-5)]
nose version 1.3.7
*** Testing for AMD Radeon R7 Graphics (CARRIZO / DRM 3.23.0 /
4.15.14-300.fc27.x86_64, LLVM 6.0.0)

----------------------------------------------------------------------
Ran 6670 tests in 995.728s

FAILED (SKIP=12, errors=580, failures=2)

All errors are: TypeError: This is for CUDA arrays.
The two failures are:
FAIL: pygpu.tests.test_elemwise.test_elemwise_f16(<built-in function add>,
'float16', 'float16', (50,))
FAIL: pygpu.tests.test_elemwise.test_elemwise_f16(<built-in function iadd>,
'float16', 'float16', (50,))

Both fail on a half-precision rounding difference. For example:
7.0390625 + 7.20703125 is expected to be 14.25, but the GPU returns 14.2421875.
The fp32 result is 14.24609375.

The GPU result is rounded down (towards zero).
The CPU result is rounded up (away from zero).

It looks like our vstore_half_rtn is not working as expected, which is weird
because it passes CTS.
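
The numbers above can be reproduced on the CPU without a GPU. The following is only a sketch: it is not part of pygpu or libgpuarray, the helper names are made up for illustration, and it handles values in the normal half-precision range only. It relies on Python's struct 'e' format, which packs IEEE 754 binary16 with round-to-nearest-even. The fp32 sum lies exactly halfway between two adjacent half values, so the rounding mode alone decides the result. As far as the OpenCL C naming goes, the _rtn suffix denotes round toward negative infinity (_rte is round to nearest even), so 14.2421875 may in fact be the result a conforming vstore_half_rtn is supposed to give here, which would also be consistent with it passing CTS.

```python
import math
import struct


def to_half_nearest_even(x: float) -> float:
    """Round x to binary16 with round-to-nearest-even (what struct 'e' does)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def half_ulp(x: float) -> float:
    """Spacing between adjacent binary16 values near x (normal range only)."""
    exponent = math.frexp(x)[1] - 1  # x = m * 2**exponent with 1 <= |m| < 2
    return 2.0 ** (exponent - 10)    # binary16 has 10 explicit mantissa bits


def to_half_toward_neg_inf(x: float) -> float:
    """Round x to binary16 toward negative infinity (hypothetical helper)."""
    ulp = half_ulp(x)
    return math.floor(x / ulp) * ulp


a, b = 7.0390625, 7.20703125  # both exactly representable in binary16
exact = a + b                 # 14.24609375, exact in fp32 and fp64

print(exact)                          # 14.24609375
print(to_half_nearest_even(exact))    # 14.25       (the CPU/NumPy result)
print(to_half_toward_neg_inf(exact))  # 14.2421875  (the GPU result)
```

If the reference result is meant to match NumPy's float16 arithmetic, the nearest-even value is the one to compare against.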

Patch

diff --git a/src/cluda_opencl.h b/src/cluda_opencl.h
index 6e0095c..8ba2d14 100644
--- a/src/cluda_opencl.h
+++ b/src/cluda_opencl.h
@@ -48,7 +48,7 @@  typedef struct _ga_half {
 } ga_half;

 #define ga_half2float(p) vload_half(0, &((p).data))
-static inline ga_half ga_float2half(ga_float f) {
+inline ga_half ga_float2half(ga_float f) {
   ga_half r;
   vstore_half_rtn(f, 0, &r.data);
   return r;
diff --git a/src/gpuarray_buffer_opencl.c b/src/gpuarray_buffer_opencl.c
index 8f12811..2041ca2 100644
--- a/src/gpuarray_buffer_opencl.c
+++ b/src/gpuarray_buffer_opencl.c
@@ -146,7 +146,7 @@  cl_ctx *cl_make_ctx(cl_context ctx, gpucontext_props *p) {
   CL_CHECKN(global_err, clGetDeviceInfo(id, CL_DEVICE_VERSION,
                                         device_version_size,
                                         device_version, NULL));
-  if (device_version[7] == '1' && device_version[9] < '2') {
+  if (device_version[7] == '1' && device_version[9] < '1') {
     error_set(global_err, GA_UNSUPPORTED_ERROR,
               "We only support OpenCL 1.2 and up");
     return NULL