pyopencl initialize non zero float4 array

I want to populate an array of type float4, but I don't know how to initialize it with anything other than zeros. I've tried several variations; the version below is the clearest illustration of what I'm trying to do:

import pyopencl as cl
import numpy as np

kernelSource = """
__kernel void addOneToFloat4(__global float4 *a)
{
    int gid = get_global_id(0);
    a[gid] += 1.0f;  // add one to every component (plain "=" would overwrite the initial values)
}
"""

context = cl.create_some_context()
queue = cl.CommandQueue(context)
device = context.devices[0]
program = cl.Program(context, kernelSource).build()

N = 10
HOST_array = np.array([[1, 0, 0, 0]]*N, dtype=cl.cltypes.float4)
TARGET_array = cl.Buffer(context, cl.mem_flags.READ_WRITE | cl.mem_flags.COPY_HOST_PTR, hostbuf=HOST_array)
cl.enqueue_copy(queue, dest=TARGET_array, src=HOST_array)

program.addOneToFloat4(queue, (N,), None, TARGET_array)

cl.enqueue_copy(queue, dest=HOST_array, src=TARGET_array)
queue.finish()

print(HOST_array)

Of course this doesn't work: NumPy interprets the nested list as an array of shape (N, 4), but since float4 is a single element type, the array needs shape (N,).
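
To make the shape problem concrete, here is a minimal NumPy-only sketch. The float4_like dtype is a stand-in I am assuming for cl.cltypes.float4 (four float32 fields named x, y, z, w); it is not taken from pyopencl itself. It shows that inner lists add a dimension, while inner tuples are read as one record each.

```python
import numpy as np

# Assumption: a stand-in for cl.cltypes.float4, built as a structured dtype
# with four float32 fields named x, y, z, w.
float4_like = np.dtype([("x", np.float32), ("y", np.float32),
                        ("z", np.float32), ("w", np.float32)])

N = 10

# Inner *lists* are read as an extra array dimension: every scalar becomes
# its own element and is copied into all four fields of that element.
from_lists = np.array([[1, 0, 0, 0]] * N, dtype=float4_like)

# Inner *tuples* are read as one record each, so the array is one-dimensional.
from_tuples = np.array([(1, 0, 0, 0)] * N, dtype=float4_like)

print(from_lists.shape)   # (10, 4)
print(from_tuples.shape)  # (10,)
```

If the same holds for the real cl.cltypes.float4, passing tuples instead of lists would be one way to get the (N,) shape the kernel expects.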

I've seen people initialize with np.zeros(N, dtype=float4), but I don't want to initialize to 0.

I find very few practical examples for pyopencl, and the documentation doesn't always help; it doesn't even mention float3 or float4.

If we look at the OpenCL documentation, we can see that the type float4 is a struct which has .x, .y, .z, .w as its fields. It is also declared as a type, so I expect to be able to use it like any other type.

CodePudding user response：

Setting the OpenCL parts aside, here is a sample of creating and filling a NumPy structured array:

In [50]: myStruct = np.dtype(
    ...:         [("position", float),
    ...:          ("direction", float),
    ...:          ("er", float),
    ...:          ("weight", float)])

In [55]: arr = np.zeros(5,myStruct)
In [56]: arr
Out[56]: 
array([(0., 0., 0., 0.), (0., 0., 0., 0.), (0., 0., 0., 0.),
       (0., 0., 0., 0.), (0., 0., 0., 0.)],
      dtype=[('position', '<f8'), ('direction', '<f8'), ('er', '<f8'), ('weight', '<f8')])
In [57]: arr['position']
Out[57]: array([0., 0., 0., 0., 0.])
In [58]: arr['position']=np.arange(5)
In [59]: arr['direction']=3
In [60]: arr['er']=np.array([[1,2,3,4,5]])
In [61]: arr['weight']=np.array([[1,2,3,4,5]]).T
Traceback (most recent call last):
  Input In [61] in <cell line: 1>
    arr['weight']=np.array([[1,2,3,4,5]]).T
ValueError: could not broadcast input array from shape (5,1) into shape (5,)
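
The session above can be adapted to a float4-style layout. This is only a sketch: float4_like is an assumed stand-in for cl.cltypes.float4 (four float32 fields x, y, z, w), and the field values are arbitrary. It also shows one way around the broadcast error, which comes from assigning a (5, 1) column to a field of shape (5,).

```python
import numpy as np

# Assumption: stand-in for cl.cltypes.float4 with four float32 fields.
float4_like = np.dtype([("x", np.float32), ("y", np.float32),
                        ("z", np.float32), ("w", np.float32)])

arr = np.zeros(5, dtype=float4_like)

arr["x"] = np.arange(5)   # one value per element
arr["y"] = 3              # a scalar is broadcast to every element

# A (5, 1) column cannot be broadcast into a field of shape (5,),
# so flatten it first.
column = np.array([[1, 2, 3, 4, 5]]).T
arr["w"] = column.ravel()

print(arr)
```

Filling field by field like this avoids building a Python list of records first, which may matter for large N.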

CodePudding user response：

After searching the pyopencl source code, I found that the problem comes from functions that are generated at runtime, and the documentation does not say explicitly that they exist. To load values into a vector type such as float4, call the matching cl.cltypes.make_<type><n> function (for example cl.cltypes.make_float4) and set the array dtype to cl.cltypes.<type><n> (for example cl.cltypes.float4). Because these names are generated at runtime, static analysis cannot see them, so your IDE will not recognize them.

myFloat4 = cl.cltypes.make_float4(0,1,1,0)
myArrayFloat4 = np.array([myFloat4], dtype=cl.cltypes.float4)
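
If the data already exists as a plain (N, 4) float32 array, another option might be to reinterpret it instead of building records one by one. The sketch below is NumPy-only and uses float4_like as an assumed stand-in with the same 16-byte layout as cl.cltypes.float4; with pyopencl installed you would presumably pass cl.cltypes.float4 to view() instead.

```python
import numpy as np

# Assumption: stand-in for cl.cltypes.float4 (4 x float32 = 16 bytes).
float4_like = np.dtype([("x", np.float32), ("y", np.float32),
                        ("z", np.float32), ("w", np.float32)])

N = 10
plain = np.zeros((N, 4), dtype=np.float32)
plain[:, 0] = 1.0  # every row becomes (1, 0, 0, 0)

# Reinterpret each row of four float32 values as one record.
# view() yields shape (N, 1); reshape() drops the extra axis.
packed = plain.view(float4_like).reshape(N)

print(packed.shape)  # (10,)
```

No data is copied here: packed shares memory with plain, so writes to one are visible through the other.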

So, for completeness, here's my fix:

import pyopencl as cl
import numpy as np

kernelSource = """
__kernel void addOneToFloat4(__global float4 *a)
{
    int gid = get_global_id(0);
    a[gid] += 1.0f;  // add one to every component (plain "=" would overwrite the initial values)
}
"""

context = cl.create_some_context()
queue = cl.CommandQueue(context)
device = context.devices[0]
program = cl.Program(context, kernelSource).build()

N = 10
HOST_array = np.array([cl.cltypes.make_float4(1, 0, 0, 0)]*N, dtype=cl.cltypes.float4)
TARGET_array = cl.Buffer(context, cl.mem_flags.READ_WRITE | cl.mem_flags.COPY_HOST_PTR, hostbuf=HOST_array)
cl.enqueue_copy(queue, dest=TARGET_array, src=HOST_array)

program.addOneToFloat4(queue, (N,), None, TARGET_array)

cl.enqueue_copy(queue, dest=HOST_array, src=TARGET_array)
queue.finish()

print(HOST_array)