Timing Considerations
Closing Timing with High DSP Block Utilization
If more than 50% of the DSP Blocks in your design are implemented with EFX_DSP24 or EFX_DSP12 primitives, the fMAX can vary significantly depending on the placement seed. Trying several seeds therefore matters more than it does for a typical design: run 3 or 4 seeds to see whether one of them helps with timing closure.
Wide Multiplier Handling
When a multiplier is in your design's critical path, you may try different synthesis options to handle the decomposed mult-add structure:
- --mult-auto-pipeline automatically adds pipeline registers in the DSP to reduce the critical path delay but increase latency. See Synthesis Options for a detailed description of this option's behavior.
- --mult-decomp-retime performs backward retiming to reduce the critical path delay. Extra registers should be added to the output of the wide multiplier to allow for retiming to occur.
- Setting the maximum cascade connected DSP blocks to 2 (--max-cc-dsp48=2) may in some cases reduce the critical path delay.
For wide multipliers that exceed the DSP width by a small amount (e.g., 19x19, 20x20), the expanded partial sums or partial products may be too small to be efficiently implemented by DSP or CARRY logic. In this case, some of them may be blasted into LUTs. This bit-blasting may prevent the aforementioned command line options from improving fMAX. If so, you may choose to disable small MULT or ADD blasting by setting the following command line options to 0:
- --small-adder-limit (Default: 4)
- --small-mult-limit (Default: 2)
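
To see why a multiplier that is only slightly wider than the DSP produces very small partial products, the following Python sketch splits each operand into DSP-sized chunks and lists the resulting partial products. It is an illustration only: the 18-bit chunk width, the unsigned arithmetic, and the function names are assumptions made for this example, and the synthesis tool's actual decomposition may differ.

```python
# Illustrative model of wide-multiplier decomposition (not the tool's algorithm).
# ASSUMPTION: each DSP multiplies unsigned operands of up to 18 x 18 bits.

def split_chunks(width, chunk):
    """Split an operand of `width` bits into chunk widths, LSB chunk first."""
    widths = []
    while width > 0:
        w = min(chunk, width)
        widths.append(w)
        width -= w
    return widths


def partial_products(a_width, b_width, dsp_a=18, dsp_b=18):
    """Return (a_bits, b_bits, left_shift) for every partial product."""
    parts = []
    a_off = 0
    for aw in split_chunks(a_width, dsp_a):
        b_off = 0
        for bw in split_chunks(b_width, dsp_b):
            parts.append((aw, bw, a_off + b_off))
            b_off += bw
        a_off += aw
    return parts


def multiply_decomposed(a, b, a_width, b_width, dsp_a=18, dsp_b=18):
    """Rebuild a * b by summing the shifted partial products (unsigned)."""
    total = 0
    a_off = 0
    for aw in split_chunks(a_width, dsp_a):
        a_chunk = (a >> a_off) & ((1 << aw) - 1)
        b_off = 0
        for bw in split_chunks(b_width, dsp_b):
            b_chunk = (b >> b_off) & ((1 << bw) - 1)
            total += (a_chunk * b_chunk) << (a_off + b_off)
            b_off += bw
        a_off += aw
    return total


if __name__ == "__main__":
    for aw, bw, shift in partial_products(20, 20):
        print(f"{aw:2d} x {bw:2d} partial product, shifted left by {shift}")
```

Under these assumptions, a 20x20 multiply breaks into one 18x18 product, two products with a 2-bit operand (18x2 and 2x18), and one 2x2 product. The narrow terms are the kind of small partial products that may be blasted into LUTs rather than mapped to DSP or CARRY logic.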