Skip to content

PooledBuffer and one other minor fix #32

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

SyamGadde
Copy link
Contributor

Another fix to allow arrays backed by PooledBuffers to be used with offsets.
Also, there's no reason someone shouldn't be able to use different slices of the same array as input and output, so also check if the offset matches to determine whether it is an in-place transform -- does not check that the two slices don't overlap in some way, a general solution to which would be complicated.

@zachjweiner
Copy link

zachjweiner commented Mar 22, 2020

Hi @SyamGadde - sorry to drag up an old PR, but I would love to get gpyfft working for PooledBuffers with offsets. Unfortunately, your proposed fix here no longer seems to work: PooledBuffers do not (currently) have a buf attribute. I'm struggling to see how to go about an alternative fix - do you happen to have any ideas?

@geggo
Copy link
Owner

geggo commented Mar 23, 2020

Do you have a short example that shows what you would like to achieve but which fails? I am using gpyfft with memory pool allocated arrays, so I don‘t see the problem.

Check for offsets in this PR is useful, will add this.

Gregor

@zachjweiner
Copy link

I'll share one in a little bit - but the issue is using sliced pool-allocated arrays (i.e., with offsets) - in which case the attempt to slice and pass a 0-offset array to the transform fails.

@zachjweiner
Copy link

Here's a reproducer:

import pyopencl as cl
import pyopencl.array as cla
import numpy as np

context = cl.create_some_context()
queue = cl.CommandQueue(context)
pool = cl.tools.MemoryPool(cl.tools.ImmediateAllocator(queue))

a = np.random.rand(256, 256, 256)
a_gpu = cla.to_device(queue, a)
b_gpu = cla.zeros(queue, (256, 256, 129), 'complex128')

from gpyfft import FFT
transform = FFT(context, queue, a_gpu, out_array=b_gpu, real=True)

a_pooled = cla.to_device(queue, np.random.rand(3, 256, 256, 256), allocator=pool)

# succeeds because no offset
event, = transform.enqueue_arrays(data=a_pooled[0], result=b_gpu, forward=True)
event.wait()

# fails with offset
event, = transform.enqueue_arrays(data=a_pooled[1], result=b_gpu, forward=True)
event.wait()

@geggo
Copy link
Owner

geggo commented Mar 24, 2020

Thanks,
it seems the problem is rooted in the fact that a PooledBuffer does not expose the same interface as a Buffer. In particular it is missing get_sub_region and getitem, needed to slice a subarray which is needed for nonzero offset. clFFT, on the other hand, expects that data starts at offset=0.

Looking at pyopencl sources (wrap_mempool.cpp) I see that PooledBuffer is derived from memory_object_holder, not buffer. This might be an oversight, but I don't know. You might ping Andreas Klöckner, author of pyopencl.

Alternatively as a workaround, the gpyfft wrapper could temporarily create sub buffers via OpenCL low level calls if it encounters nonzero offset, but this should be better handled by pyopencl.

best
Gregor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants