The Fast Fourier Transform (FFT) is a well-known way to significantly reduce the number of computations needed for the Discrete Fourier Transform. Historically, it has played a significant role in the creation of the field of digital signal processing. Over the years, BitSim has worked with different solutions for FFT cores. Some experiences are:
FFT cores with a variable number of complex points (4-8192) that can be changed during operation, in other words without any requirement for a reboot or a new configuration.
FFT cores for extremely high throughput. These FFT cores have had a transform time better than 25 ns with 128-512 complex points.
Special window functions in order to achieve the desired performance and/or behaviour.
Algorithms where the FFT is part of the system and can provide a zoom function in the frequency domain. This makes it possible to rapidly get a deep analysis of the transformed data in combination with a coarse frequency resolution.
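One possible reading of the zoom function can be sketched as follows: a coarse transform locates the bin of interest, and the spectrum is then evaluated on a finer frequency grid around that bin only. This is a minimal illustration under stated assumptions, not BitSim's implementation; the function names (`dft_at`, `coarse_peak_bin`, `zoom`), the test tone and the zoom factor are all hypothetical, and a direct DFT sum is used in place of an FFT to keep the sketch short.

```python
import cmath
import math


def dft_at(samples, freq):
    """Evaluate the Fourier sum at one normalized frequency (cycles/sample)."""
    return sum(x * cmath.exp(-2j * math.pi * freq * n)
               for n, x in enumerate(samples))


def coarse_peak_bin(samples):
    """Coarse pass: strongest bin on the ordinary N-point grid (0 .. N/2 - 1)."""
    n = len(samples)
    mags = [abs(dft_at(samples, k / n)) for k in range(n // 2)]
    return max(range(len(mags)), key=mags.__getitem__)


def zoom(samples, center_bin, zoom_factor=16):
    """Fine pass: evaluate +/- one bin around center_bin on a denser grid."""
    n = len(samples)
    step = 1.0 / (n * zoom_factor)  # assumed fine-grid spacing
    freqs = [center_bin / n + m * step
             for m in range(-zoom_factor, zoom_factor + 1)]
    return [(f, abs(dft_at(samples, f))) for f in freqs]


# Hypothetical test tone that falls between two coarse bins (10.3 of 64).
n_points = 64
tone = [math.cos(2 * math.pi * (10.3 / n_points) * i) for i in range(n_points)]

peak_bin = coarse_peak_bin(tone)  # coarse peak lands in bin 10
best_freq, _ = max(zoom(tone, peak_bin), key=lambda pair: pair[1])
print(peak_bin, best_freq * n_points)
```

The coarse pass only resolves whole bins, while the fine pass refines the estimate to a fraction of a bin without transforming the whole band at the finer resolution.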
Choice of radix size
Today's FPGA products can hold more and more butterflies or radix blocks. Therefore, you can build larger FFTs with the same throughput. As you add more radix blocks, however, the lack of memory bandwidth becomes a bottleneck. If you make the radix blocks bigger (4, 8, 16, ...), the required memory access bandwidth is reduced. Multiplications of complex or real numbers by ±1 or ±i are easy to implement, since they reduce to sign changes and swaps of the real and imaginary parts.
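As a rough illustration of why a larger radix eases the memory bottleneck, assume a pure radix-r FFT of N points that makes log_r(N) passes over memory and reads and writes every point once per pass. This is a simplified model, not a description of any particular core, and the function names are hypothetical.

```python
def fft_stages(n_points, radix):
    """Number of passes over memory for a pure radix-`radix` FFT."""
    stages = 0
    remaining = n_points
    while remaining > 1:
        if remaining % radix != 0:
            raise ValueError("n_points must be a power of radix")
        remaining //= radix
        stages += 1
    return stages


def memory_accesses(n_points, radix):
    """Assumed model: each stage reads and writes every complex point once."""
    return 2 * n_points * fft_stages(n_points, radix)


for radix in (2, 4, 8, 16):
    print(radix, fft_stages(4096, radix), memory_accesses(4096, radix))
# 2 12 98304
# 4 6 49152
# 8 4 32768
# 16 3 24576
```

Under this model, going from radix-2 to radix-16 for 4096 points cuts the number of memory accesses by a factor of four, at the cost of a larger arithmetic block.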
For a radix-8 block we have 7 complex multiplications, 4 real multiplications and 52 additions. In hardware we need 25 multipliers and 63 adders. Radix-8 needs 7N/8 different complex coefficients.
For a radix-16 block we have 15 complex multiplications, 20 real multiplications and 148 additions. In hardware we need 65 multipliers and 193 adders. Radix-16 needs 15N/16 different complex coefficients.
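The multiplier figures above are consistent with each complex multiplication being built from three real multipliers (7·3 + 4 = 25 and 15·3 + 20 = 65), treating the 4 standalone multiplications of the radix-8 block as real ones. That mapping is an assumption and is not stated in the text. The sketch below shows one common three-multiplier scheme and tallies the multiplier and coefficient counts; all function names are hypothetical.

```python
def complex_mul_3(a, b, c, d):
    """(a + ib) * (c + id) with three real multiplications (assumed scheme)."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    return k1 - k3, k1 + k2  # (real part, imaginary part)


def hardware_multipliers(complex_mults, real_mults, per_complex=3):
    """Real multipliers needed, assuming `per_complex` per complex product."""
    return per_complex * complex_mults + real_mults


def coefficient_count(n_points, radix):
    """(radix - 1) * N / radix, matching 7N/8 and 15N/16 in the text."""
    return (radix - 1) * n_points // radix


print(complex_mul_3(1, 2, 3, 4))     # (-5, 10), i.e. (1 + 2i)(3 + 4i)
print(hardware_multipliers(7, 4))    # 25 for radix-8
print(hardware_multipliers(15, 20))  # 65 for radix-16
print(coefficient_count(4096, 8))    # 3584
print(coefficient_count(4096, 16))   # 3840
```

For fixed coefficients, the terms `d - c` and `c + d` can be precomputed, so this scheme trades one multiplier for a few extra adders per complex product.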
The choice of radix (2, 4, 8, ...) should always be taken into consideration and be the result of a carefully made system analysis.