Chapter 5

FFT, Windowing, and Spectral Estimation

The window you pick is a confession of what you care about. Pick one before you understand the choice and your measurement is already wrong.

5.1 DFT vs FFT: Computational Heart of the RTSA

Every spectrum on every RTSA display starts with a discrete Fourier transform. The DFT takes $N$ complex samples in time and produces $N$ complex values in frequency. Each output bin is the projection of the input onto a complex sinusoid at that bin's frequency.

The textbook DFT formula is:

$$X[k] = \sum_{n=0}^{N-1} x[n] \cdot e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1$$

If you compute this directly, you do $N$ complex multiplies for each of $N$ outputs, for a total of $N^2$ multiplies. For a 4096-point transform, that is about 16.7 million multiplies. For a 1-million-point transform, it is a trillion multiplies. Real-time spectrum analysis at thousands of frames per second is impossible at $O(N^2)$.

The Fast Fourier Transform reduces this to $O(N \log N)$ multiplies. For 4096 points, the FFT does about 50 thousand multiplies instead of 16.7 million. A factor-of-340 speedup. For 1 million points, the speedup is a factor of 50,000. The FFT is the reason real-time spectrum analyzers exist.

The most common FFT algorithm is Cooley-Tukey radix-2. It requires $N$ to be a power of 2 and decomposes the transform recursively into pairs of half-length transforms. Modern FPGAs implement it in pipelined hardware that ingests one sample per clock and emits a complete spectrum every $N$ clocks. At 2 GHz clock and 4096-point transforms, that is roughly 488 thousand spectra per second per pipeline.
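The gap between the two approaches is easy to see in a few lines. The sketch below is a minimal illustration, not how an FPGA pipeline is built: `dft_direct`, `fft_radix2`, and `multiply_counts` are hypothetical names, and it uses only the Python standard library so it runs anywhere. It evaluates the DFT sum directly, evaluates it again with a recursive radix-2 decomposition, and confirms the two agree.

```python
import cmath
import math


def dft_direct(x):
    """Evaluate the DFT sum term by term: N multiplies for each of N outputs."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT. len(x) must be a power of 2."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # half-length transform of the even samples
    odd = fft_radix2(x[1::2])   # half-length transform of the odd samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]  # twiddle factor times odd term
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def multiply_counts(n):
    """Return (N^2, N*log2(N)) for a power-of-2 length."""
    return n * n, n * int(math.log2(n))


N = 64
signal = [cmath.exp(2j * math.pi * 3 * t / N) for t in range(N)]  # tone on bin 3
max_error = max(abs(a - b) for a, b in zip(dft_direct(signal), fft_radix2(signal)))
print(f"max difference: {max_error:.1e}")
print(multiply_counts(4096))  # (16777216, 49152)
```

The recursion mirrors the description above: each call splits the transform into two half-length transforms and stitches them back together with one multiply per output pair.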

Bin Frequency and Bin Spacing

The output of an FFT lives on a discrete frequency grid. Each bin corresponds to a frequency:

$$f_k = \frac{k \cdot f_s}{N}$$

for $k = 0, 1, \ldots, N/2 - 1$ (the positive frequencies; for complex I/Q input, bins $N/2$ through $N-1$ hold the negative frequencies $f_k - f_s$). The spacing between adjacent bins is:

$$\Delta f = \frac{f_s}{N}$$

This is the resolution bandwidth of the FFT, before windowing.
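As a quick sketch of the frequency grid, the helpers below map a bin index to a frequency and compute the spacing. The function names are hypothetical, and the 245 MS/s rate is borrowed from the examples later in this chapter. For complex I/Q input the upper half of the bins is conventionally read as negative frequencies, which the helper assumes.

```python
def bin_spacing(fs, n):
    """Spacing between adjacent FFT bins, in Hz."""
    return fs / n


def bin_frequency(k, fs, n):
    """Baseband frequency of bin k for complex (I/Q) input.

    Bins 0 .. N/2-1 are the positive frequencies k*fs/N.
    Bins N/2 .. N-1 are read as the negative frequencies (k-N)*fs/N.
    """
    if not 0 <= k < n:
        raise ValueError("bin index out of range")
    if k >= n // 2:
        k -= n
    return k * fs / n


FS = 245e6  # sample rate in samples per second
N = 4096
print(f"bin spacing: {bin_spacing(FS, N):.1f} Hz")
print(f"bin 1: {bin_frequency(1, FS, N):.1f} Hz")
print(f"bin {N - 1}: {bin_frequency(N - 1, FS, N):.1f} Hz")
```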

What the FFT Is Actually Telling You

Each FFT output bin is a complex number whose magnitude is proportional to the amplitude of the input at that bin's frequency (the unnormalized DFT sum scales an on-bin tone by $N$), and whose phase is the phase of the input at that frequency. The full spectrum is therefore both an amplitude spectrum and a phase spectrum. Most displays show only the amplitude. Phase is used for modulation analysis, group delay, and cross-spectrum measurements.

A practical RTSA also computes power spectral density (PSD), which normalizes the FFT magnitude squared for window energy and bin width:

$$\text{PSD}[k] = \frac{|X[k]|^2}{f_s \cdot \sum_n w[n]^2}$$

where $w[n]$ is the window function. PSD is power per hertz, usually displayed in dBm/Hz after conversion to a logarithmic scale, and lets you compare measurements at different RBWs on a common axis.
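A minimal sketch of that normalization follows. It assumes the samples are scaled so that $|x[n]|^2$ is instantaneous power in watts; `psd`, `to_dbm_per_hz`, and the 1 MS/s rate are hypothetical choices for illustration. A useful sanity check is that integrating the PSD over all bins returns the signal power, whatever window is used.

```python
import cmath
import math


def psd(samples, window, fs):
    """PSD[k] = |X[k]|^2 / (fs * sum(w^2)), X being the DFT of the windowed samples."""
    n = len(samples)
    xw = [w * s for w, s in zip(window, samples)]
    window_energy = sum(w * w for w in window)
    out = []
    for k in range(n):
        xk = sum(xw[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        out.append(abs(xk) ** 2 / (fs * window_energy))
    return out


def to_dbm_per_hz(watts_per_hz):
    """Convert a linear density in W/Hz to dBm/Hz."""
    return 10 * math.log10(watts_per_hz / 1e-3)


N = 64
FS = 1.0e6  # hypothetical sample rate
hann = [0.5 * (1 - math.cos(2 * math.pi * t / (N - 1))) for t in range(N)]
tone = [cmath.exp(2j * math.pi * 5.3 * t / N) for t in range(N)]  # 1 W tone, off-bin

density = psd(tone, hann, FS)
total_power = sum(density) * FS / N  # integrate the density over all bins
print(f"integrated power: {total_power:.6f} W")
```

The tone is deliberately off-bin and windowed, yet the integrated power still comes out at 1 W. That is the point of dividing by the window energy.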

5.2 Window Functions

A finite-length FFT sees only $N$ consecutive samples of an infinite-length signal. The transition at the start and end of that window introduces artifacts. Window functions are how we manage those artifacts.

Why Windowing Exists

Imagine sampling a pure sinusoid at exactly 100.0 MHz with a sample rate that gives a bin centered at exactly 100.0 MHz. The signal lines up perfectly. The FFT produces a single bin of energy at 100.0 MHz and zero everywhere else. Beautiful.

Now shift the signal to 100.05 MHz. The sinusoid no longer fits an integer number of cycles in the window. The discontinuity at the window boundaries causes the FFT to produce a spread of energy across many bins. This is spectral leakage. Energy that belongs at one frequency leaks into the entire spectrum.

A window function tapers the input signal to zero at the start and end of the window before the FFT runs. The taper smooths the boundary discontinuity, dramatically reducing leakage. The cost is a wider main lobe at every spectral peak.
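The effect is easy to reproduce at small scale. The sketch below is an illustration under simple assumptions (a 64-point transform, a complex tone, hypothetical helper names). It compares a tone exactly on a bin with a tone half a bin off, with and without a Hann taper, and reports the level about ten bins away from the tone.

```python
import cmath
import math


def spectrum_db(samples, window):
    """DFT magnitude of the windowed samples, in dB relative to the largest bin."""
    n = len(samples)
    xw = [w * s for w, s in zip(window, samples)]
    mags = [
        abs(sum(xw[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]
    peak = max(mags)
    return [20 * math.log10(max(m, 1e-300) / peak) for m in mags]


def tone(cycles, n):
    """Complex tone completing `cycles` cycles within the n-sample window."""
    return [cmath.exp(2j * math.pi * cycles * t / n) for t in range(n)]


N = 64
rect = [1.0] * N
hann = [0.5 * (1 - math.cos(2 * math.pi * t / (N - 1))) for t in range(N)]

on_bin = spectrum_db(tone(16.0, N), rect)        # integer number of cycles
off_bin_rect = spectrum_db(tone(16.5, N), rect)  # half a bin off, no taper
off_bin_hann = spectrum_db(tone(16.5, N), hann)  # half a bin off, Hann taper

FAR = 26  # a bin roughly ten bins above the tone
print(f"on-bin, rectangular:  {on_bin[FAR]:8.1f} dB")
print(f"off-bin, rectangular: {off_bin_rect[FAR]:8.1f} dB")
print(f"off-bin, Hann:        {off_bin_hann[FAR]:8.1f} dB")
```

The on-bin case sits at the numerical noise floor, the untapered off-bin case leaks only a few tens of dB below the peak, and the Hann taper pushes that leakage far lower.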

Figure 5-1. Time-domain shapes (left) and frequency responses (right) of common window functions. Rectangular has the narrowest main lobe but worst sidelobe suppression. Flat-top has the widest main lobe but the best amplitude accuracy. Blackman-Harris balances dynamic range and main lobe width. Hann is the general-purpose default.

The Window Menu

| Window | Main lobe -3 dB (bins) | Highest sidelobe (dB) | Best for |
|---|---|---|---|
| Rectangular | 0.89 | -13 | tones aligned to bins |
| Hann (Hanning) | 1.44 | -32 | general-purpose |
| Hamming | 1.30 | -43 | low-noise tone detection |
| Blackman | 1.68 | -58 | strong sidelobe suppression |
| Blackman-Harris (4-term) | 1.90 | -92 | extreme dynamic range |
| Flat-top | 3.86 | -88 | accurate amplitude |
| Kaiser ($\beta=14$) | 2.39 | -120 | tunable extreme suppression |
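The sidelobe column can be spot-checked numerically. The sketch below evaluates each window's frequency response on a fine grid, walks down the main lobe to its first null, and reports the highest level beyond it. It is a rough check under stated assumptions: a 64-sample window, 16 grid points per bin, hypothetical function names, and only the three windows whose formulas are simple enough to write in one line.

```python
import cmath
import math


def window_response_db(window, oversample=16):
    """Response in dB relative to the peak, sampled from 0 to N/2 bins."""
    n = len(window)
    mags = []
    for i in range((n // 2) * oversample):
        f = i / (oversample * n)  # frequency in cycles per sample
        mags.append(abs(sum(w * cmath.exp(-2j * math.pi * f * t)
                            for t, w in enumerate(window))))
    peak = mags[0]  # all-positive windows peak at zero frequency
    return [20 * math.log10(max(m, 1e-300) / peak) for m in mags]


def highest_sidelobe_db(window, oversample=16):
    """Highest response level beyond the first null of the main lobe."""
    resp = window_response_db(window, oversample)
    i = 0
    while i + 1 < len(resp) and resp[i + 1] < resp[i]:
        i += 1  # still descending the main lobe
    return max(resp[i:])


N = 64
rect = [1.0] * N
hann = [0.5 * (1 - math.cos(2 * math.pi * t / (N - 1))) for t in range(N)]
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * t / (N - 1)) for t in range(N)]

rect_sl = highest_sidelobe_db(rect)
hann_sl = highest_sidelobe_db(hann)
hamming_sl = highest_sidelobe_db(hamming)
print(f"rectangular {rect_sl:.1f} dB, Hann {hann_sl:.1f} dB, Hamming {hamming_sl:.1f} dB")
```

The results should land close to the table's -13, -32, and -43 dB. Small differences are expected because the window is short and the grid is finite.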

Pick by use case. Hann is the default for general spectrum monitoring. It is fast to compute, gives modest sidelobe suppression, and is fine for most measurements. Flat-top is the right choice when you need accurate amplitude readings. Its main lobe is wide enough that a tone landing anywhere in a bin reads at almost the same amplitude. Use it when measuring the absolute power of a known tone, calibrating, or doing compliance work. Blackman-Harris is for high dynamic range measurements such as searching for weak signals near strong carriers. Wide main lobes, very low sidelobes. Kaiser is the sniper rifle: parameter $\beta$ continuously adjusts the tradeoff. Use when none of the standard windows fit.

In Aaronia RTSA Suite PRO, the window is a per-display setting. Each display (live, waterfall, persistence) can use a different window because each has different priorities. The waterfall might use Hann for fast updates while the high-resolution view uses Blackman-Harris for dynamic range.

The Math

For a window of length $N$, the Hann window is:

$$w[n] = 0.5 \cdot \left(1 - \cos\left(\frac{2\pi n}{N-1}\right)\right)$$

Hamming differs slightly:

$$w[n] = 0.54 - 0.46 \cos\left(\frac{2\pi n}{N-1}\right)$$

Blackman is a 3-term cosine sum, Blackman-Harris a 4-term, and Kaiser uses modified Bessel functions. Each is computed once per FFT length and stored as a table. The runtime cost of windowing is negligible.
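A sketch of the "compute once, store as a table" idea, using the two formulas above. The names `hann_table`, `hamming_table`, and `apply_window` are hypothetical, and the cache here is just Python's `functools.lru_cache` standing in for whatever lookup memory a real instrument uses.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def hann_table(n):
    """Hann coefficients for length n (n >= 2), computed once and then reused."""
    return tuple(0.5 * (1 - math.cos(2 * math.pi * k / (n - 1))) for k in range(n))


@lru_cache(maxsize=None)
def hamming_table(n):
    """Hamming coefficients for length n (n >= 2), computed once and then reused."""
    return tuple(0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n))


def apply_window(samples, table):
    """Windowing at run time is one multiply per sample."""
    if len(samples) != len(table):
        raise ValueError("window length must match the number of samples")
    return [w * s for w, s in zip(table, samples)]


table = hann_table(4096)
print(table[0], table[2048])        # zero at the edge, close to one mid-window
print(hamming_table(4096)[0])       # Hamming does not reach zero at the edge
print(hann_table(4096) is table)    # second call returns the stored table
```

The Hann taper reaches exactly zero at the ends of the window while Hamming stops at about 0.08, which is the "slight" difference between the two formulas.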

5.3 Spectral Leakage, Scalloping Loss, and ENBW

Three artifacts of finite-length spectral estimation deserve named attention.

Spectral Leakage

Leakage is energy that belongs at one frequency appearing at others. Without a window, the first sidelobe sits only 13 dB below the peak and the sidelobes fall off slowly, as $1/f$, so leakage reaches across the entire spectrum. With a Hann window, the highest sidelobe drops to -32 dB and the sidelobes fall off as $1/f^3$.

Leakage matters when you are looking for a weak signal near a strong one. Without windowing, a strong carrier can hide an adjacent signal that is more than 13 dB weaker. With Blackman-Harris, the same carrier hides only signals more than 92 dB below it.

Scalloping Loss

Scalloping is the variation in measured amplitude as a tone moves between bin centers. With a rectangular window, a tone exactly between two bins reads about 4 dB lower than the same tone exactly on a bin. With a Hann window, the variation drops to 1.4 dB. With a Flat-top window, scalloping is essentially eliminated, less than 0.01 dB.

This is why Flat-top is the window for accurate amplitude. The wide main lobe is a feature, not a bug. It guarantees that a tone anywhere in the bin reads at the correct amplitude.
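The rectangular and Hann figures can be reproduced directly: evaluate the window's response for a tone exactly on a bin and for a tone half a bin away, and take the ratio. This is a minimal sketch with a hypothetical function name and a 1024-sample window; it treats the half-bin offset as the worst case, which holds for these two windows.

```python
import cmath
import math


def scalloping_loss_db(window):
    """Amplitude dip, in dB, for a tone halfway between two bin centers."""
    n = len(window)
    on_bin = sum(window)  # response to a tone exactly on a bin
    # A half-bin offset is 0.5/n cycles per sample, i.e. pi*t/n radians at sample t.
    half_bin = abs(sum(w * cmath.exp(-1j * math.pi * t / n)
                       for t, w in enumerate(window)))
    return -20 * math.log10(half_bin / on_bin)


N = 1024
rect = [1.0] * N
hann = [0.5 * (1 - math.cos(2 * math.pi * t / (N - 1))) for t in range(N)]

rect_loss = scalloping_loss_db(rect)
hann_loss = scalloping_loss_db(hann)
print(f"rectangular: {rect_loss:.2f} dB")
print(f"Hann:        {hann_loss:.2f} dB")
```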

Equivalent Noise Bandwidth (ENBW)

When you measure the noise floor of a signal, the noise power is integrated across the equivalent noise bandwidth of the analyzer:

$$\text{ENBW} = \frac{\sum_n w[n]^2}{(\sum_n w[n])^2 / N} \cdot \Delta f$$

ENBW is wider than the bin spacing $\Delta f$ for any tapered window; only the rectangular window has ENBW equal to $\Delta f$. For Hann, ENBW is 1.5 times the bin spacing. For Flat-top, it's 3.77 times. When you compare noise floors across different window choices, you need to normalize by ENBW or you'll get the wrong answer. Most modern RTSAs do this normalization automatically and display dBm/Hz instead of dBm-per-bin. The user does not see the math. The math is still happening.
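The ENBW formula translates almost line for line into code. In the sketch below the function names are hypothetical. The flat-top coefficients are a widely used five-term definition; the text does not say which flat-top variant it has in mind, so treat them as an assumption.

```python
import math


def enbw_bins(window):
    """ENBW in units of bin spacing: N * sum(w^2) / (sum(w))^2."""
    n = len(window)
    return n * sum(w * w for w in window) / sum(window) ** 2


def hann(n):
    return [0.5 * (1 - math.cos(2 * math.pi * k / (n - 1))) for k in range(n)]


def flattop(n):
    # Assumed five-term cosine-sum coefficients (a common flat-top definition).
    a = (0.21557895, 0.41663158, 0.277263158, 0.083578947, 0.006947368)
    out = []
    for k in range(n):
        x = 2 * math.pi * k / (n - 1)
        out.append(a[0] - a[1] * math.cos(x) + a[2] * math.cos(2 * x)
                   - a[3] * math.cos(3 * x) + a[4] * math.cos(4 * x))
    return out


N = 4096
FS = 245e6
rect_enbw = enbw_bins([1.0] * N)
hann_enbw = enbw_bins(hann(N))
flattop_enbw = enbw_bins(flattop(N))
flattop_enbw_hz = flattop_enbw * FS / N  # scale by the bin spacing

print(f"rectangular {rect_enbw:.2f}, Hann {hann_enbw:.2f}, flat-top {flattop_enbw:.2f} bins")
print(f"flat-top ENBW at {N} points, 245 MS/s: {flattop_enbw_hz / 1e3:.0f} kHz")
```

To compare noise floors measured with different windows, subtract $10 \log_{10}(\text{ENBW})$ in hertz from each per-bin reading. That is the normalization the instrument performs when it displays dBm/Hz.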

5.4 Overlap-Add and Overlap-Save Processing

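We met overlap in Chapter 2 as the mechanism for gap-free observation. The same idea appears in two places in an RTSA: overlapped FFT segments for display, and the overlap-add and overlap-save methods for fast convolution.

For display, successive FFTs share a fraction of their samples. With 50 percent overlap, FFT $k$ uses samples $[kN/2, kN/2 + N - 1]$, and FFT $k+1$ uses $[kN/2 + N/2, kN/2 + 3N/2 - 1]$. Each input sample is in two consecutive FFT windows, so a sample attenuated near the edge of one window sits near the center of the next. The result is that power spectra averaged across overlapping FFTs do not systematically underweight any part of the input.

Overlap-add and overlap-save are the fast-convolution counterparts. In overlap-add, the input is cut into non-overlapping blocks, each block is zero-padded and filtered by multiplication in the frequency domain, and the output tails that spill past each block are added into the next. Overlap-save is the dual: you compute FFTs over overlapping input blocks and discard the portions of each output that are corrupted by circular convolution effects. Both are used in fast convolution implementations, such as digital channelizer filters in an RTSA. The user never sees this layer; it is hidden in the FPGA.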

For RTSA displays, the standard overlap is 50 percent for waterfall (faster updates) and 75 percent for high-priority live trace (smoother spectra). Higher overlaps (87.5 or 93.75 percent) appear in research-grade instruments where every dB of dynamic range matters. Aaronia's RTSA Suite PRO supports user-selectable overlap up to 93.75 percent, with the practical limit set by host CPU/GPU compute.
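The segment arithmetic is simple enough to sketch. The hypothetical helper below lists the first and last sample index of each FFT window for a given overlap fraction, which makes the cost of higher overlap visible: the same record yields many more FFTs.

```python
def segment_bounds(total_samples, n, overlap):
    """Return (first, last) inclusive sample indices for each FFT window."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    hop = round(n * (1 - overlap))  # samples between successive window starts
    if hop < 1:
        raise ValueError("overlap too high for this FFT length")
    return [(s, s + n - 1) for s in range(0, total_samples - n + 1, hop)]


N = 1024
RECORD = 4096
for overlap in (0.0, 0.5, 0.75, 0.9375):
    segs = segment_bounds(RECORD, N, overlap)
    print(f"{overlap * 100:6.2f}% overlap: {len(segs):3d} FFTs, first three {segs[:3]}")
```

At 50 percent overlap the windows start every $N/2$ samples. At 93.75 percent they start every $N/16$ samples, so the compute load per second of signal is sixteen times that of the non-overlapped case.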

5.5 Welch's Method and Averaging Strategies

A single FFT of a noisy signal is itself noisy. The standard deviation of the spectral estimate at any bin is comparable to its mean. To get a smooth, reliable noise floor, you average.

Welch's method is the standard. Divide the long input record into overlapping segments of length $N$. Apply a window to each segment. Compute the FFT magnitude squared (the periodogram) for each segment. Average the periodograms across segments.

The result is a power spectral density estimate with reduced variance. With $K$ averages, the variance drops by a factor of approximately $K$ (for non-overlapped segments). With overlapped segments, the reduction is slightly less than $K$ because adjacent segments share information.
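The steps of Welch's method can be sketched in a few lines. This is an illustration under simple assumptions, not an instrument implementation: complex white noise of unit power, a 32-point Hann-windowed FFT, 50 percent overlap, a hypothetical 1 MS/s sample rate, and hypothetical function names. Because white noise has a flat spectrum, the bin-to-bin spread of the estimate is a fair stand-in for its variance.

```python
import cmath
import math
import random
import statistics


def hann(n):
    return [0.5 * (1 - math.cos(2 * math.pi * k / (n - 1))) for k in range(n)]


def periodogram(segment, window, fs):
    """Windowed |FFT|^2, normalized to a power spectral density."""
    n = len(segment)
    energy = sum(w * w for w in window)
    xw = [w * s for w, s in zip(window, segment)]
    return [
        abs(sum(xw[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        / (fs * energy)
        for k in range(n)
    ]


def welch(x, n, fs, overlap=0.5):
    """Average the periodograms of overlapping, Hann-windowed segments."""
    window = hann(n)
    hop = max(1, round(n * (1 - overlap)))
    starts = range(0, len(x) - n + 1, hop)
    acc = [0.0] * n
    for s in starts:
        for k, p in enumerate(periodogram(x[s:s + n], window, fs)):
            acc[k] += p
    return [a / len(starts) for a in acc]


def relative_spread(values):
    """Standard deviation across bins divided by the mean across bins."""
    return statistics.pstdev(values) / statistics.mean(values)


random.seed(5)
FS = 1.0e6  # hypothetical sample rate
N = 32
sigma = math.sqrt(0.5)  # real and imaginary parts each carry half the power
noise = [complex(random.gauss(0, sigma), random.gauss(0, sigma)) for _ in range(N * 33)]

single = periodogram(noise[:N], hann(N), FS)
averaged = welch(noise, N, FS)
print(f"single periodogram spread: {relative_spread(single):.2f}")
print(f"Welch average spread:      {relative_spread(averaged):.2f}")
```

The single periodogram swings by roughly its own mean, while the averaged estimate settles close to the true density of $1/f_s$ watts per hertz.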

In practice, an RTSA displays a running Welch estimate in four forms: live trace (the latest FFT only, maximum responsiveness, maximum noise), averaged trace (a moving average across the last K FFTs, smooth noise floor, slight lag for transients), max hold (the bin-wise maximum across all FFTs since the trace started, catches transients but builds up over time), and min hold (the bin-wise minimum, reveals the noise floor and rejects intermittent strong signals). These four traces overlay on the same display. Each tells you a different thing. A practiced engineer reads them simultaneously.
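The four traces amount to four small pieces of state updated once per FFT. The class below is a hypothetical sketch (the name `TraceSet` and its interface are invented for illustration) and it assumes each spectrum arrives as a list of linear power values, since averaging should be done in linear power rather than in dB.

```python
from collections import deque


class TraceSet:
    """Keeps live, averaged, max-hold, and min-hold traces for a stream of spectra."""

    def __init__(self, depth):
        self.history = deque(maxlen=depth)  # the last `depth` spectra for averaging
        self.max_hold = None
        self.min_hold = None

    def update(self, spectrum):
        spectrum = list(spectrum)
        self.history.append(spectrum)
        if self.max_hold is None:
            self.max_hold = spectrum[:]
            self.min_hold = spectrum[:]
        else:
            self.max_hold = [max(a, b) for a, b in zip(self.max_hold, spectrum)]
            self.min_hold = [min(a, b) for a, b in zip(self.min_hold, spectrum)]

    @property
    def live(self):
        return self.history[-1]

    @property
    def averaged(self):
        k = len(self.history)
        return [sum(column) / k for column in zip(*self.history)]


traces = TraceSet(depth=2)
for spectrum in ([1, 1, 1], [1, 9, 1], [1, 1, 1]):  # a one-frame burst in bin 1
    traces.update(spectrum)

print("live:    ", traces.live)
print("averaged:", traces.averaged)
print("max hold:", traces.max_hold)
print("min hold:", traces.min_hold)
```

After the burst has passed, the live trace has forgotten it, the average still shows a diluted version, max hold keeps it at full height, and min hold never saw it.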

When to use each: live + averaged for general monitoring, max hold when hunting for intermittent signals, min hold when characterizing the noise floor or rejecting a known interferer, persistence (Chapter 2) when you want a probability density rather than a max or average. Welch is a tool. So is max hold. So is persistence. Each captures a different statistical aspect of the signal. The cost of an RTSA is paying for the compute that runs all of them in parallel; the value is having all of them at your fingertips when troubleshooting.

5.6 Choosing FFT Length for a Given Measurement

The user's most common question is: "what FFT length should I use?" The answer depends on what you are measuring.

For a Known Continuous Signal

Use the longest FFT you can afford. Long FFTs give narrow RBW, which means lower noise floor (because less noise integrates into each bin) and higher dynamic range. The penalty is slow updates, but for a continuous signal that does not matter.

For a 5G NR carrier viewed at 30 kHz subcarrier resolution, a 65,536-point FFT at 245 MS/s sample rate gives 3.7 kHz RBW, which is fine enough to resolve every subcarrier with margin. Each FFT takes about 0.27 milliseconds to acquire (65,536 samples at 245 MS/s), so the analyzer can produce roughly 3,700 spectra per second before any averaging. Plenty fast for visual confirmation.

For an Intermittent Burst

Use a short FFT. A 1024-point FFT at 245 MS/s acquires in 4.2 microseconds, fast enough to land at least one window inside most realistic bursts. The RBW is 240 kHz, which is coarse but acceptable for burst visualization in a waterfall.

For Accurate Amplitude

Use a Flat-top window with a moderate FFT length. The Flat-top eliminates scalloping; the moderate length keeps the noise floor reasonable. 4096 points at 245 MS/s gives 60 kHz RBW with ENBW of 226 kHz under Flat-top. Amplitude readings are within 0.02 dB regardless of where the tone falls in the bin.

For Modulation Analysis

Match the FFT length to the symbol rate or subcarrier spacing. For 5G NR with 30 kHz subcarriers, you want bins of 7.5 kHz or finer (a quarter of subcarrier spacing). At 245 MS/s, that is at least 32,768 points. RTSA Suite PRO modulation analysis configures this automatically based on the standard you select.
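The arithmetic behind these choices fits in three small helpers. The names are hypothetical, and the power-of-two rounding assumes a radix-2 FFT as described in Section 5.1.

```python
import math


def bin_spacing(fs, n):
    """Bin spacing in Hz for an n-point FFT at sample rate fs."""
    return fs / n


def acquisition_time(fs, n):
    """Time in seconds needed to collect the n samples of one FFT."""
    return n / fs


def fft_length_for_bin_spacing(fs, max_spacing_hz):
    """Smallest power-of-two length whose bin spacing is at most max_spacing_hz."""
    needed = fs / max_spacing_hz
    return 1 << math.ceil(math.log2(needed))


FS = 245e6
# Burst case: a short FFT trades resolution for speed.
print(f"1024 points: {bin_spacing(FS, 1024) / 1e3:.0f} kHz bins, "
      f"{acquisition_time(FS, 1024) * 1e6:.1f} us per FFT")
# Modulation case: a quarter of a 30 kHz subcarrier spacing.
print(f"7.5 kHz bins need {fft_length_for_bin_spacing(FS, 7.5e3)} points")
```

Resolution and acquisition time are two readings of the same number: halving the bin spacing doubles the time each FFT needs.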

For Phase Noise Measurements

Use the longest FFT you can sit through. Phase noise sidebands close to a carrier need fine RBW to resolve. A measurement out to 1 MHz offset with 100 Hz RBW needs about 2.45 million points at 245 MS/s, or far fewer if the I/Q is first decimated to a narrow span around the carrier. Modern RTSAs like the SPECTRAN V6 PLUS handle this through their high-resolution analysis path, often computed offline from streamed I/Q.

Aaronia in Practice: Presets in RTSA Suite PRO

RTSA Suite PRO ships with a library of measurement presets, each preconfigured with the right window, FFT length, overlap, and averaging strategy for a specific task. "5G NR EVM," "BLE channel survey," "ISM compliance," "Phase noise scan," and dozens more. The user picks a preset, points the instrument at the signal, and gets a correct measurement without needing to know which window or FFT length is appropriate.

For practitioners who want to dig deeper, every preset is editable and saveable. Inherit a preset, tweak the FFT length for your specific scenario, save it under a new name. The compute pipeline stays valid because the framework forces consistent ENBW normalization. This is where the visual-graph approach (Chapter 4) and the preset library combine. Beginners get correctness from presets; experts get flexibility from the graph. Both work in the same instrument.

Chapter Summary
