Inquiry about the implementation feasibility of a FFT-based algorithm using this lib #46
Unanswered
Vandermode asked this question in Q&A
Replies: 1 comment 1 reply
Hi @Vandermode
Dear developer @markjolah,
Thank you very much for developing this fantastic lib!
I am wondering whether I can leverage this lib to implement an FFT-based algorithm for scientific computing.
The algorithm of interest can basically be viewed as the iterative execution of a sequence of FFT-based 2D convolutions, as follows:
in each iteration, we perform two FFT-based convolutions (with two fixed kernels) on the input, then take a weighted combination of the two results to get the input for the next iteration.
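To make the iteration concrete, here is a minimal NumPy sketch of the loop as described above. It is only an illustration under stated assumptions, not the actual implementation: the convolutions are assumed to be circular, the weights `w1` and `w2` are assumed to be constant scalars, and the names `iterate_fft_conv` and `iterate_folded` are hypothetical.

```python
import numpy as np

def iterate_fft_conv(x, k1, k2, w1, w2, n_iter):
    """Run n_iter steps of x <- w1 * conv(x, k1) + w2 * conv(x, k2).

    conv is a circular 2D convolution (an assumption of this sketch).
    """
    # The kernels are fixed, so their spectra are computed once,
    # zero-padded to the input shape.
    K1 = np.fft.rfft2(k1, s=x.shape)
    K2 = np.fft.rfft2(k2, s=x.shape)
    for _ in range(n_iter):
        X = np.fft.rfft2(x)
        y1 = np.fft.irfft2(X * K1, s=x.shape)  # first convolution
        y2 = np.fft.irfft2(X * K2, s=x.shape)  # second convolution
        x = w1 * y1 + w2 * y2                  # weighted combination
    return x

def iterate_folded(x, k1, k2, w1, w2, n_iter):
    """Same result when each step is purely linear (assumed here):
    both kernels fold into one transfer function applied n_iter times."""
    H = w1 * np.fft.rfft2(k1, s=x.shape) + w2 * np.fft.rfft2(k2, s=x.shape)
    return np.fft.irfft2(np.fft.rfft2(x) * H**n_iter, s=x.shape)
```

If each step really is only the linear update shown here, the FFT is linear and the two routines agree up to rounding, so the data never has to leave the frequency domain between iterations. If the real algorithm applies anything nonlinear between iterations (clipping, a constraint, a data-dependent weight), only the first form applies.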
For real-world tasks, hundreds of thousands of iterations might be needed to finish one round, so implementation efficiency matters a great deal.
Currently, I am using the fused convolution kernel of the high-performance VkFFT lib (https://github.com/DTolm/VkFFT), which is more efficient than separate cuFFT calls.
I am curious whether the whole iterative algorithm could be implemented as a single kernel using this lib (rather than just fusing the convolution op). That said, is there any room I can exploit to further accelerate this implementation?
Thanks a lot!