Understanding Fast Fourier Transform from scratch — to solve Polynomial Multiplication.

By Aiswarya Prakasan

Fast Fourier Transform is a widely used algorithm in Computer Science. It is also generally regarded as difficult to understand. I have spent the last few days trying to understand the algorithm — how it works and why. In the process I digressed to various other mathematical topics to build a complete understanding ground up. Here, I have tried to collate and document everything I have learnt. In this blog, we will use FFT (Fast Fourier Transform) to solve the problem of quickly multiplying two polynomials. I must warn that this is a long (and tedious?) post. But, you should be able to read and understand everything from start to end without having to look anything up.

Prerequisites — You do not need to know Fourier Transform to understand this blog. However, a basic understanding of the following is required -

  • Polynomials
  • Complex Numbers
  • Trigonometry
  • Matrix Multiplication
  • Algorithm Complexity Analysis (Big O notation) — You are free to skip these parts and it shouldn’t affect the understanding of working of the algorithm.

Problem — Given two polynomials- A(x) =a0+a1x+a2x^2+… anx^n and B(x) =b0+b1x+b2x^2+… bnx^n, find the polynomial C(x) = A(x)*B(x).

E.g., multiply (x^2 + 2x + 3)(2x^2 + 5) = 

2x^2 + 0x + 5
1x^2 + 2x + 3
6 0 15
4 0 10
2 0 5
2x^4 + 4x^3 + 11x^2 +10x+15

More generally, when C(x) = A(x)*B(x) and C(x) =c0+c1x+c2x^2+... c2nx^2n .

ci = a0*bi + a1*bi-1 + ... + ai*b0.

If we think of A and B as vectors, then C vector is called 'Convolution' of A and B (represented as {A⊗B}). Make sure that the polynomials A and B are appropriately appended with 0s to make each of their degree equal to 2n (degree of C). Straightforward computation is O(n^2) time

Fourier Transform and Convolution Theorm

Fourier analysis is the decomposition of any mathematical function into a series of sine and cosine waves (sinusoids). If you wish to learn more, here are some links that help build the intuition — a quora answer on fft, an interactive guide to Fourier Transform.

We generally represent a function in its Fourier Transform form, as it is easier to work with.

The Convolution Theorm : states that under suitable conditions the Fourier transform of a convolution (denoted by ‘⊗’) is the point-wise product of Fourier transforms. Let f and g be two functions with convolution {f⊗ g}. Fourier transform be denoted by the operator ‘ζ’ so ζ(f) and ζ(g) are the Fourier transforms of f and g, respectively. -

ζ(f⊗g) = ζ(f).ζ(g)

The convolution operation is converted to point-wise multiplication (we will discuss what point-wise multiplication means).

Representation of Polynomials

The representation of Polynomial we saw above is called the coefficient representation. Another way of representing a polynomial is called point-value representation. A polynomial of degree n-1 can be uniquely represented by its values at at-least n different points. This is a result of the Fundamental Theorm of Algebra that states — “A degree n-1 polynomial A(x) is uniquely specified by its evaluation at n distinct values of x”.

A polynomial A(x) =a0+a1x+a2x^2+… anx^n can be represented as

A set of n pairs {(x0, y0),(x1, y1),...,(xn, yn)} such that 
• for all i != j, xi != xj. ie, the points are unique.
• for every k, yk = A(xk);

In point-value form, multiplication 𝐶(𝑥) = 𝐴(𝑥)𝐵(𝑥) is given by 𝐶(𝑥𝑘) = 𝐴(𝑥𝑘) .𝐵(𝑥𝑘) for any point (𝑥𝑘).

If 𝐴 and 𝐵 are of degree-bound 'n', then 𝐶 is of degree-bound '2n'.
Need to start with “extended” point-value forms for 𝐴 and 𝐵 consisting of
2𝑛 point-value pairs each. –

𝐴: (𝑥0, 𝑦0) , (𝑥1, 𝑦1) , … ,( 𝑥2𝑛−1, 𝑦2𝑛−1)
𝐵: (𝑥0, 𝑦0') , (𝑥1, 𝑦1 ′) , … , (𝑥2𝑛−1, 𝑦2𝑛−1 ′)
𝐶: (𝑥0, 𝑦0𝑦0 ′) ,( 𝑥1, 𝑦1𝑦1 ′) , … ,( 𝑥2𝑛−1, 𝑦2𝑛−1𝑦2𝑛−1 ′)

For example,
𝐴(𝑥) = 𝑥 ^3 − 2𝑥 + 1
𝐵 𝑥 = 𝑥 ^3 + 𝑥^ 2 + 1
𝑥𝑘 = (−3, −2, −1, 0, 1, 2, 3) We need 7 coefficients!
𝐴: (−3,−17) ,( −2,−3) ,( −1,1) ,( 0, 1) ,( 1, 0) , (2, 5) , (3, 22)
𝐵: (−3,−20) ,(−2,−3) , (−1, 2) ,( 0, 1) ,( 1, 3) ,( 2, 13) ,( 3, 37)
𝐶: (−3,340) , (−2,9) , (−1,2) ,( 0, 1) ,( 1, 0) ,( 2, 65) ,( 3, 814)

This is point-wise multiplication. The time to multiply two polynomials of degree-bound n in point-value form is Θ(n).The inverse of evaluation--determining the coefficient form of a polynomial from a point-value representation--is called interpolation. This is what the Convolution Theorm states with respect to Fourier Transform of a function.

The Road so far

Now the efficiency of our algorithm depends on the efficiency of conversion between the two representation -

This is where we use the FFT algorithm.

An Overview of the Algorithm

Let m = 2n-1. [so degree of C is less than m]
1. Pick m points x_0, x_1, ..., x_{m-1} chosen cleverly.
2. Evaluate A at each of the points: A(x_0),..., A(x_{m-1}).
3. Same for B.
4. Now compute C(x_0),..., C(x_{m-1}), where C is A(x)*B(x)
5. Interpolate to get the coefficients of C.

At a glance, it looks like points 2 and 3 from the Algorithm takes O(n^2) time. However, the FFT will allow us to quickly move from coefficient representation of polynomial to the point-value representation, and back, for our cleverly chosen set of m points. (Doesn’t work for arbitrary m points. The special points will turn out to be the roots of unity).

A Detour to Roots of Unity

The word “root” in the term refers to square roots, cube roots, and any other roots you might happen to need. For any integer n, the nth root of a number k is a number that, when multiplied by itself n times, yields k. The word “unity,” means “one.” So a root of unity is any number which, when multiplied by itself some number of times, yields 1. To get any root of unity, other than 1, we need to delve into Complex Numbers.

We can think of the set of complex numbers as a 2-dimensional plane. We specify a complex number with two coordinates the same way we would on a graph: the point (1,1) refers to the number 1+i. If we were standing on the point (0,0), we probably wouldn’t want to walk one unit over and then one unit up to get to the point (1,1). Instead, we’d take a diagonal by walking √2 units at an angle of 45 degrees to the x-axis. Notationally, when we use this radial distance and direction, we write a complex number as re^i θ, where r is the distance and θ is the direction, usually written in radians rather than degrees. There are 2π radians in a whole circle, so the number 1 is written e^i 2π. The angle 45 degrees is π/4 radians, so the point (1,1) that we found above would be written as √2e^i π/4.

This method of specifying a point with a length and a direction is called using polar coordinates.
If a point is represented as re^iθ, the x-co-ordinate(real part) is given by rcosθ, and the y-co-ordinate (imaginary part) is given by rsinθ. This follows from basic trigonometric identities.

  polar coordinates : re^iθ
cartesian coordinates :(rcosθ, rsinθ)

Multiplication with the polar notation of re^iθ, is really easy: 
you multiply the distances together and add the angles. So 5e^iπ/6× 2e^iπ/3=10e^i π/2. So multiplying involves both expansion or contraction (that’s the distance part) and rotation (that’s the angle part).

If we have a number written re^i θ which, when multiplied by itself a certain number of times, yields 1, the distance r itself must be 1. If r were larger than 1, say 2, then as we multiplied the number by itself more and more times, its distance from (0,0) would go from 2 to 4 to 8 and on and on, spiralling out to infinity. If r were smaller than 1, say 1/2, the point would spiral in to 0: 1/2, 1/4, 1/8, and so on. 1 is the only radial distance that will stay perfectly balanced, just moving around the circle as we multiply the numbers together.

To make this more concrete, I happen to know that e^i2π/7 is a root of unity. When I raise it to the 7th power, I get e^i2π, which is 1. Each time I multiply e^i2π/7 by itself (product =e^i(2*2π/7)) , I rotate 1/7 of the way around the circle. In fact, as I multiply e^i2π/7 by itself successively, I get this picture -

In fact, there are seven 7th roots of unity, and each gold disc in that picture is one of them. We can get an nth root of unity for any number n by replacing the 7 in e^i2π/7 by n. The pictures for other roots of unity look a lot like that diagram above, they just have a different number of gold discs.

There are exactly n complex nth root of unity e^2𝜋𝑖𝑘/𝑛 for k = 0, 1, … , 
𝑛 − 1 where e^iu= cos (u) + 𝑖 sin(u). Here ‘u’ represents an angle in radians.

Visualising 8th roots of unity-

Some Interesting Properties of Roots of Unity

  1. Principal nth Root of Unity
  • The value ω𝑛 = e^2𝜋𝑖/𝑛 is called the principal nth root of unity.
  • All of the other complex nth roots of unity are powers of 𝜔𝑛.
  • The n complex nth roots of unity, 𝜔𝑛^0 , 𝜔𝑛^1 , … ,𝜔𝑛^n-1 form a group under multiplication that has the same structure as (ℤn, +) modulo n.
  • 𝜔𝑛^𝑛 = 𝜔𝑛^0 = 1 implies
    – 𝜔𝑛^𝑗.𝜔𝑛^𝑘 = 𝜔𝑛^(𝑗+𝑘) = 𝜔𝑛^((𝑗+𝑘) mod n)
    𝜔𝑛^-1 = 𝜔𝑛^n-1

2. Cancellation Lemma

  • For any integers n ≥ 0, k ≥ 0, and b > 0, 𝜔𝑑𝑛^𝑑𝑘 = 𝜔𝑛^𝑘.
  • Proof : 𝜔𝑑𝑛^𝑑𝑘 = (𝑒^(2𝜋𝑖/dn))^dk = 𝑒^(2𝜋𝑖/nk) = 𝜔𝑛^𝑘
  • For any even integer n > 0, 𝜔𝑛^(𝑛/2) = 𝜔2 = −1.
  • Example 𝜔24^6 = 𝜔4
    – 𝜔24^6 = (𝑒^(2𝜋𝑖/24))6 = 𝑒^(2𝜋𝑖*6/24) = 𝑒^2𝜋𝑖/4 = 𝜔4

3. Halving Lemma

  • If n > 0 is even, then the squares of the n complex n-th roots of unity are the n/2 complex n/2-th roots of unity.
  • Proof : By the cancellation lemma, we have( 𝜔𝑛 ^𝑘) 2 = 𝜔𝑛/2^ 𝑘 for any non-negative integer, 𝑘.
  • If we square all of the complex n-th roots of unity, then each n/2^th root of unity is obtained exactly twice .
    – (𝜔𝑛^(𝑘+𝑛/ 2))2 = 𝜔𝑛^(2𝑘+𝑛) = 𝜔𝑛^2𝑘*𝜔𝑛^𝑛 = 𝜔𝑛^2𝑘 = (𝜔𝑛^k)^2
    –Thus, 𝜔𝑛^𝑘 and 𝜔𝑛^(𝑘+𝑛/2) have the same square.
As we will see, the fast Fourier transform algorithm cleverly makes 
use of the following properties about ωn:
𝜔𝑛^n =1
𝜔𝑛^n+𝑘 = 𝜔𝑛^𝑘
𝜔𝑛^n/2 = -1
𝜔𝑛^(n/2+𝑘) = -𝜔𝑛^

Discrete Fourier Transform

Since we have broken up the problem, our goal now is to evaluate a given polynomial, A (x)(degree <m) at m points of our choosing in total time O(m log m). Assume m is a power of 2.

Let’s first develop it through an example.

Say m=8, so we have a polynomial
A(x) = a_0 + a_1 x + a_2 x² + a_3 x³ + a_4 x⁴ + a_5 x⁵ + a_6x⁶ + a_7x⁷.
(as a vector, A = [a_0, a_1, …, a_7])

And we want to evaluate at eight points of our choosing. Here is an idea. Split A into two pieces, but instead of left and right, have them be even and odd. So, as vectors,
A_even = [a_0, a_2, a_4, a_6]
A_odd = [a_1, a_3, a_5, a_7]

or, as polynomials:

A_even(x) = a_0 + a_2 x + a_4 x² + a_6 x³
A_odd(x) = a_1 + a_3 x + a_5 x² + a_7 x³.

Each has degree < m/2. How can we write A(x) in terms of A_even and A_odd?
A(x) = A_even(x²) + x A_odd(x²).
A(-x) = A_even(x²) — x A_odd(x²).

Now, instead of calculating the value for 1 polynomial of degree m, we now need to calculate 2 polynomials of degree (m/2) and then combine them.

Let T(m) = time to evaluate a degree-m polynomial at our special set of m points. We’re doing this by evaluating two degree-m/2 polynomials at m/2 points each (the squares), and then doing O(m) work to combine the results. This is great because the recurrence T(m) = 2T(m/2) + O(m) solves to O(m log m).

What’s nice is that the effort spent computing A(x) will give us A(-x) almost for free. So, we need to specially choose m points that will have the property:

The 2nd half of points are the negative of the 1st half   (***)

E.g., {1,2,3,4,-1,-2,-3,-4}.

But, we’re deluding ourselves by saying “just do it recursively”. Why is that? The problem is that recursively, our special points (now {1, 4, 9, 16}) have to satisfy property (***). E.g., they should really look like {1, 4, -1, -4}. But these are squares! To fix these we use complex numbers. E.g., if these are the squares, what do the original points look like?

{1, 2, i, 2i, -1, -2, -i, -2i}

so then their squares are: 1, 4, -1, -4
and their squares are: 1, 16

But, at the next level we again need property (***). So, we want to have {1, -1} there. This means we want the level before that to be {1, i, -1, -i}, which is the same as {1, i, i^2, i^3}.

So, for the original level, let ω = the primitive 8th root of unity, and then our original set of points will be:

1,ω ,ω^2,ω^3,ω^4 (= -1),ω^5 (= -ω ),ω^6 (= -ω^2),ω^7 (= -ω^3)

so that the squares are: 1, i, i² (= -1), i³ (= -i)
and *their* squares are: 1, -1
and *their* squares are: 1

In general, the mth primitive root of unity is the vector
ω = cos(2*pi/m) + i*sin(2*pi/m)

Fast Fourier Transform (FFT) 
The problem of evaluating 𝐴(𝑥) at 𝜔𝑛^0 , 𝜔𝑛^1 , … , 𝜔𝑛^𝑛−1 reduces to 
1. evaluating the degree-bound 𝑛/2 polynomials 𝐴even(𝑥) and 𝐴odd(𝑥) at the points (𝜔𝑛^0)^2 ,(𝜔𝑛^1)^2 , … , (𝜔𝑛^𝑛−1)^2.
2. combining the results by 𝐴(𝑥) = 𝐴even(𝑥2) + 𝑥𝐴odd(𝑥2).

Why bother?
– The list (𝜔𝑛^0)^2 ,(𝜔𝑛^1)^2 , … , (𝜔𝑛^𝑛−1)^2 does not contain 
𝑛 distinct values, but 𝑛/2 complex 𝑛/2-th roots of unity.
– Polynomials 𝐴even and 𝐴odd are recursively evaluated at the 𝑛/2 complex 𝑛/2-th roots of unity.
– Subproblems have exactly the same form as the original problem, but are half the size

A(x) = 3+2x+3x^2+4x^3
A(ω4^0) = A(1) = 3+2+3+4=12
A(ω4^1) = A(i) = 3+2i-3+4i=6i
A(ω4^2) = A(-1) = 3–2+3–4=0
A(ω4^3) = A(-i) = 3–2i-3+4i=2i

Using A(x)= Aeven(x^2) + xAodd(x^2)
Aeven(x) = 3+3x
Aodd(x) = 2+4x
A(ω4^0)= A(1) = Aeven(1)+1.Aodd(1)= 3+3+2+4=12
A(ω4^1)= A(i) = Aeven(-1)+i.Aodd(-1)=3–3+2i+4i=6i
A(ω4^2)= A(-1) = Aeven(1)-1.Aodd(1)= 3+3–2–4=0
A(ω4^3)= A(-i)= Aeven(-1)-i.Aodd(-1)=3–3–2i+4i= 2i

Here is the general algorithm in pseudo-C:

Let A be array of length m, w be primitive mth root of unity.
Goal: produce DFT F(A): evaluation of A at 1, w, w^2,...,w^{m-1}.
FFT(A, m, w)
if (m==1) return vector (a_0)
else {
A_even = (a_0, a_2, ..., a_{m-2})
A_odd = (a_1, a_3, ..., a_{m-1})
F_even = FFT(A_even, m/2, w^2)//w^2 is a m/2-th root of unity
F_odd = FFT(A_odd, m/2, w^2)
F = new vector of length m
x = 1
for (j=0; j < m/2; ++j) {
F[j] = F_even[j] + x*F_odd[j]
F[j+m/2] = F_even[j] - x*F_odd[j]
x = x * w
return F

Note : To apply this algorithm, ‘m’ should be a power of 2.We should pad
extra zeros to the polynomial to ensure that length of the array is a
power of 2. This algorithm returns an array of complex numbers which is
then used for point-wise multiplication.

Here is the tree of input vectors to the recursive calls of the FFT procedure. The initial invocation is for n = 8.

Inverse Fourier Transform

Once we perform point-wise multiplication on the Fourier Transforms of A and B and get C (an array of Complex Numbers), we need to convert it back to coefficient form to get the final answer.

We use the Inverse Fourier Transform to get the coefficient form of C (Don’t worry about the imaginary part of the complex number. Once the Inverse Fourier Transform is performed, we get an array of Complex Numbers with the imaginary part almost equal to 0. Hence ,it can be ignored.

What we have actually done so far (computing A(x) at 1, ω,ω2, …,ω{m-1}) can be represented as-

This can be represented in the matrix form as -

Lets use F to denote the matrix of primitive roots. In order to retrieve the coefficients, we need to get F^-1(inverse of matrix F) , and multiply it with the y- matrix. It turns out that F^-1 looks the same. In place of ω, the inverse Fourier Matrix uses ω^-1.

if ω = a+ib   => ω^-1 = a-ib
if ω = e^2Πi/k => ω^-1 = e^-2Πi/k

so, F^-1

Based on the above observation, we can still apply FFT by replacing a with y, y with a, ωn with ωn^−1 (that is, ωn^n-1 ), and scaling the result by 1/n .

So, the final algorithm is:

Let F_A = FFT(A, m, ω) // time O(n log n)
Let F_B = FFT(B, m, ω) // time O(n log n)
For i=1 to m, let F_C[i] = F_A[i]*F_B[i] // time O(n)
Output C = 1/m * FFT(F_C, m, ω-1). // time O(n log n)

I will be adding the full Java code for polynomial multiplication in a couple of hours.

References :