@@ -6,16 +6,16 @@ AMD EPYC CPUs. It is developed on top of FFTW (version fftw-3.3.10).
 All known features and functionalities of FFTW are retained and supported
 as it is with this AMD optimized FFTW library.
 
-AOCL-FFTW achieves higher performance than the original FFTW 3.3.10 due to its
-various optimizations involving improved SIMD Kernel functions, improved copy
-functions (cpy2d and cpy2d_pair used in rank-0 transform and buffering plan),
+AOCL-FFTW achieves high performance as a result of its various optimizations
+involving improved SIMD Kernel functions, improved copy functions
+(cpy2d and cpy2d_pair used in rank-0 transform and buffering plan),
 improved 256-bit kernels selection by Planner and an optional in-place
 transpose for large problem sizes. AOCL-FFTW improves the performance
-of in-place MPI FFTs over FFTW 3.3.10 by employing a faster in-place MPI
-transpose function. AOCL-FFTW provides a new fast planner mode as an
-extension to the original planner that improves planning time of various
-planning modes in general and PATIENT mode in particular. Another new planning
-mode called Top N planner is also available that minimizes single-threaded
+of in-place MPI FFTs by employing a faster in-place MPI transpose function.
+AOCL-FFTW provides a new fast planner mode as an extension to the original
+planner that improves planning time of various planning modes in general
+and PATIENT mode in particular. Another new planning mode called
+Top N planner is also available that minimizes single-threaded
 run-to-run variations. AOCL-FFTW has a feature called AMD's application
 optimization layer that speeds up HPC and scientific applications. AOCL-FFTW
 implements the dynamic dispatcher feature that can build a single portable
@@ -45,7 +45,8 @@ generation architectures.
 
  ./configure --enable-sse2 --enable-avx --enable-avx2 --enable-avx512
  --enable-mpi --enable-openmp --enable-shared
- --enable-amd-opt --enable-amd-mpifft
+ --enable-amd-opt --enable-amd-mpifft
+ --enable-dynamic-dispatcher
  --prefix=<your-install-dir>
  make
  make install
@@ -85,7 +86,7 @@ problem types, Quad or Long double precisions, and split array format.
 
 Dynamic dispatcher achieves Function Multi-versioning by using compiler's
 attributes. Use "--enable-dynamic-dispatcher" configure option to enable this
-feature. It is supported for GCC compiler and Linux based systems for now.
+feature. It is supported for Linux based systems for now.
 The set of x86 CPUs on which the single portable library can work depends upon
 the highest level of CPU SIMD instruction set with which it is configured.
 
@@ -101,9 +102,8 @@ CONTACTS
 --------
 
 AOCL-FFTW is developed and maintained by AMD.
-You can contact us on the email-id [email protected] .
-You can also raise any issue/suggestion on the git-hub repository at
-https://github.com/amd/amd-fftw/issues
+For support of these libraries and the other tools of AMD Zen Software Studio,
+see https://www.amd.com/en/developer/aocc/compiler-technical-support.html
 
 ACKNOWLEDGEMENTS
 ----------------
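The dynamic dispatcher hunk states that the set of x86 CPUs a portable build can run on depends on the highest SIMD level passed to configure. As a rough illustration only, the sketch below maps a Linux-style CPU flag list to the SIMD options used in the README's ./configure line. The function name `simd_configure_flags` is hypothetical and not part of AOCL-FFTW, and pairing the `avx512f` CPU flag with `--enable-avx512` is an assumption rather than something the README states.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with AOCL-FFTW.
# $1: space-separated CPU feature flags, in the style of /proc/cpuinfo.
# Prints the matching SIMD configure options from the README, in input order.
simd_configure_flags() {
    out=""
    for flag in $1; do
        case "$flag" in
            sse2)    out="$out --enable-sse2" ;;
            avx)     out="$out --enable-avx" ;;
            avx2)    out="$out --enable-avx2" ;;
            # Assumption: avx512f is the flag that gates --enable-avx512.
            avx512f) out="$out --enable-avx512" ;;
        esac
    done
    # Drop the single leading space left by the first append.
    echo "${out# }"
}

# Example on a Linux host (not executed here):
#   simd_configure_flags "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
```

The output reflects only what the build host supports. With the dynamic dispatcher enabled, a library configured for a higher SIMD level than the host's is still meant to run on it, so treat this as a way to inspect a machine rather than a rule for choosing options.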