 kierdavis.com Go back Open original

# Computing sin & cos in hardware with synthesisable Verilog

The CORDIC algorithm is a clever method for accurately computing trigonometric functions using only additions, bitshifts and a small lookup table.

## The Algorithm

It’s well known that rotating the vector $$(1, 0)$$ anticlockwise about the origin by an angle $$\theta$$ gives the vector $$(\cos \theta, \sin \theta)$$. We will use this as the basis of our algorithm:

function cordic(θ) { [c, s] ← rotate([1, 0], θ) return c, s
}

Let’s split up this big rotation into $$N$$ smaller rotations, with the angle of rotation in step $$i$$ given by $$\alpha_i$$. At the moment, we don’t care what the values of $$\alpha_i$$ are, we just want their sum to be equal to $$\theta$$.

α ← [ ...
] function cordic(θ) { c ← 1 s ← 0 for i in 0 .. N-1 { [c, s] ← rotate([c, s], α[i]) } return c, s
}

How do we rotate a vector? The page for rotation on Wikipedia tells us that it is equivalent to left-multiplying by a particular matrix:

α ← [ ...
] function cordic(θ) { c ← 1 s ← 0 for i in 0 .. N-1 { rotation_matrix ← [[cos α[i], -sin α[i]], [sin α[i], cos α[i]]] [c, s] ← rotation_matrix × [c, s] } return c, s
}

Let’s expand out this matrix multiplication:

α ← [ ...
] function cordic(θ) { c ← 1 s ← 0 for i in 0 .. N-1 { c_new ← cos α[i] × c - sin α[i] × s s_new ← sin α[i] × c + cos α[i] × s c ← c_new s ← s_new } return c, s
}

We will use the fact that $$\sin \alpha_i = \cos \alpha_i \tan \alpha_i$$ to factorise by $$\cos \alpha_i$$:

α ← [ ...
] function cordic(θ) { c ← 1 s ← 0 for i in 0 .. N-1 { c_new ← cos α[i] × (c - tan α[i] × s) s_new ← cos α[i] × (s + tan α[i] × c) c ← c_new s ← s_new } return c, s
}

Now, we will consider the values of $$\alpha_i$$. Let’s assume that each rotation angle $$\alpha_i$$ has a fixed magnitude $$\beta_i$$ with respect to $$\theta$$, but can be either positive or negative depending on the value of $$\theta$$. In this way, the rotating vector can be directed to converge on a particular result vector.

We will accomplish this using a new variable $$\phi$$, initially at 0. Each step, we compare $$\phi$$ to $$\theta$$. If $$\phi$$ is less than $$\theta$$, we need to rotate by a positive angle (i.e. $$\alpha_i = \beta_i$$). If $$\phi$$ is greater than $$\theta$$, we need to rotate by a negative angle ($$\alpha_i = -\beta_i$$).

β ← [ ...
] function cordic(θ) { c ← 1 s ← 0 φ ← 0 for i in 0 .. N-1 { if φ < θ { direction ← 1 } else { direction ← -1 } α ← direction × β[i] c_new ← cos α × (c - tan α × s) s_new ← cos α × (s + tan α × c) c ← c_new s ← s_new φ ← φ + α } return c, s
}

Let’s eliminate the variable $$\alpha$$ and use $$\beta_i$$ directly:

β ← [ ...
] function cordic(θ) { c ← 1 s ← 0 φ ← 0 for i in 0 .. N-1 { if φ < θ { direction ← 1 } else { direction ← -1 } c_new ← cos (direction × β[i]) × (c - tan (direction × β[i]) × s) s_new ← cos (direction × β[i]) × (s + tan (direction × β[i]) × c) c ← c_new s ← s_new φ ← φ + (direction × β[i]) } return c, s
}

$$\cos (-x) = \cos x$$ and $$\tan (-x) = -\tan x$$, so this program is equivalent to:

β ← [ ...
] function cordic(θ) { c ← 1 s ← 0 φ ← 0 for i in 0 .. N-1 { if φ < θ { direction ← 1 } else { direction ← -1 } c_new ← cos β[i] × (c - direction × tan β[i] × s) s_new ← cos β[i] × (s + direction × tan β[i] × c) c ← c_new s ← s_new φ ← φ + (direction × β[i]) } return c, s
}

Now what about the values of $$\beta_i$$? If we assign them to be such that $$\tan \beta_i = 2^{-i}$$, then we have:

β ← [ atan 2^0, atan 2^1, ... atan 2^(N-1)
] function cordic(θ) { c ← 1 s ← 0 φ ← 0 for i in 0 .. N-1 { if φ < θ { direction ← 1 } else { direction ← -1 } c_new ← cos β[i] × (c - direction × 2^(-i) × s) s_new ← cos β[i] × (s + direction × 2^(-i) × c) c ← c_new s ← s_new φ ← φ + (direction × β[i]) } return c, s
}

Multiplication by $$2^{-i}$$ is a shift-right by $$i$$ places:

β ← [ atan 2^0, atan 2^1, ... atan 2^(N-1)
] function cordic(θ) { c ← 1 s ← 0 φ ← 0 for i in 0 .. N-1 { if φ < θ { direction ← 1 } else { direction ← -1 } c_new ← cos β[i] × (c - direction × (s >> i)) s_new ← cos β[i] × (s + direction × (c >> i)) c ← c_new s ← s_new φ ← φ + (direction × β[i]) } return c, s
}

Let’s move the multiplications by $$\cos \beta_i$$ into a separate loop:

β ← [ atan 2^0, atan 2^1, ... atan 2^(N-1)
] function cordic(θ) { c ← 1 s ← 0 φ ← 0 for i in 0 .. N-1 { if φ < θ { direction ← 1 } else { direction ← -1 } c_new ← c - direction × (s >> i) s_new ← s + direction × (c >> i) c ← c_new s ← s_new φ ← φ + (direction × β[i]) } for i in 0 .. N-1 { c ← c × cos β[i] s ← s × cos β[i] } return c, s
}

In fact, all these multiplications can be replaced with a single value, which we will call $$K$$. $$K$$ is equal to the product of $$\cos \beta_i$$ for all $$i$$ between $$0$$ and $$N-1$$ inclusive. To put it mathematically,

$K = \prod_{i=0}^{N-1} \cos \beta_i$

It can be shown through the use of trigonometric identities that:

$K = \prod_{i=0}^{N-1} \frac{1}{\sqrt{1 + 2^{-2i}}}$

$$K$$ tends to approximately $$0.607252935$$ as $$N$$ increases.

Therefore, we have:

β ← [ atan 2^0, atan 2^1, ... atan 2^(N-1)
] K ← 1
for i in 0 .. N-1 { K ← K × cos β[i]
} function cordic(θ) { c ← K s ← 0 φ ← 0 for i in 0 .. N-1 { if φ < θ { direction ← 1 } else { direction ← -1 } c_new ← c - direction × (s >> i) s_new ← s + direction × (c >> i) c ← c_new s ← s_new φ ← φ + (direction × β[i]) } return c, s
}

Finally, we will change $$\phi$$ so that it now holds $$\theta$$ minus “whatever it was holding before this change”. This replaces the comparison against $$\theta$$ with a comparison against $$0$$, which is simpler to compute:

β ← [ atan 2^0, atan 2^1, ... atan 2^(N-1)
] K ← 1
for i in 0 .. N-1 { K ← K × cos β[i]
} function cordic(θ) { c ← K s ← 0 φ ← θ for i in 0 .. N-1 { if φ > 0 { direction ← 1 } else { direction ← -1 } c_new ← c - direction × (s >> i) s_new ← s + direction × (c >> i) c ← c_new s ← s_new φ ← φ - (direction × β[i]) } return c, s
}

What we now have is an algorithm that is possible to implement in hardware, but is still equivalent to the original algorithm.

## The implementation

We will assume that all numbers are stored as 32-bit fixed-point numbers, with the radix point between the second-most-significant and third-most-significant bits. This allows for numbers ranging between -2 and 2, which includes the interval $$-\frac{\pi}{2}$$ to $$\frac{\pi}{2}$$ that all other inputs can be reduced into.

First off, we declare the constants that we will use:

define K 32'h26dd3b6a // = 0.6072529350088814 define BETA_0 32'h3243f6a9 // = atan 2^0 = 0.7853981633974483
define BETA_1 32'h1dac6705 // = atan 2^(-1) = 0.4636476090008061
define BETA_2 32'h0fadbafd // = atan 2^(-2) = 0.24497866312686414
define BETA_3 32'h07f56ea7 // = atan 2^(-3) = 0.12435499454676144
define BETA_4 32'h03feab77 // = atan 2^(-4) = 0.06241880999595735
define BETA_5 32'h01ffd55c // = atan 2^(-5) = 0.031239833430268277
define BETA_6 32'h00fffaab // = atan 2^(-6) = 0.015623728620476831
define BETA_7 32'h007fff55 // = atan 2^(-7) = 0.007812341060101111
define BETA_8 32'h003fffeb // = atan 2^(-8) = 0.0039062301319669718
define BETA_9 32'h001ffffd // = atan 2^(-9) = 0.0019531225164788188
define BETA_10 32'h00100000 // = atan 2^(-10) = 0.0009765621895593195
define BETA_11 32'h00080000 // = atan 2^(-11) = 0.0004882812111948983
define BETA_12 32'h00040000 // = atan 2^(-12) = 0.00024414062014936177
define BETA_13 32'h00020000 // = atan 2^(-13) = 0.00012207031189367021
define BETA_14 32'h00010000 // = atan 2^(-14) = 6.103515617420877e-05
define BETA_15 32'h00008000 // = atan 2^(-15) = 3.0517578115526096e-05
define BETA_16 32'h00004000 // = atan 2^(-16) = 1.5258789061315762e-05
define BETA_17 32'h00002000 // = atan 2^(-17) = 7.62939453110197e-06
define BETA_18 32'h00001000 // = atan 2^(-18) = 3.814697265606496e-06
define BETA_19 32'h00000800 // = atan 2^(-19) = 1.907348632810187e-06
define BETA_20 32'h00000400 // = atan 2^(-20) = 9.536743164059608e-07
define BETA_21 32'h00000200 // = atan 2^(-21) = 4.7683715820308884e-07
define BETA_22 32'h00000100 // = atan 2^(-22) = 2.3841857910155797e-07
define BETA_23 32'h00000080 // = atan 2^(-23) = 1.1920928955078068e-07
define BETA_24 32'h00000040 // = atan 2^(-24) = 5.960464477539055e-08
define BETA_25 32'h00000020 // = atan 2^(-25) = 2.9802322387695303e-08
define BETA_26 32'h00000010 // = atan 2^(-26) = 1.4901161193847655e-08
define BETA_27 32'h00000008 // = atan 2^(-27) = 7.450580596923828e-09
define BETA_28 32'h00000004 // = atan 2^(-28) = 3.725290298461914e-09
define BETA_29 32'h00000002 // = atan 2^(-29) = 1.862645149230957e-09
define BETA_30 32'h00000001 // = atan 2^(-30) = 9.313225746154785e-10
define BETA_31 32'h00000000 // = atan 2^(-31) = 4.656612873077393e-10

Then, we introduce the module and its ports.

module cordic( angle, clock, // Master clock reset, // Master asynchronous reset (active-high) start, // An input signal that the user of this module should set to high when computation should begin angle_in, // Input angle cos_out, // Output value for cosine of angle sin_out // Output value for sine of angle
); input clock;
input reset;
input start;
input [31:0] angle_in;
output [31:0] cos_out;
output [31:0] sin_out; wire [31:0] cos_out = cos;
wire [31:0] sin_out = sin;

We need five registers: cos, sin and angle from the above algorithm, a counter register count, and a state register state. The first three are 32 bits wide, since they are storing fixed-point numbers as described above. count is 5 bits since 32 iterations will be used, and state is only 1 bit since we only need two states.

The inputs to this registers will be determined in a sequential logic block, so five more registers cos_next, sin_next, angle_next, count_next and state_next are defined, although these will be automatically reduced to logic functions during synthesis.

reg [31:0] cos;
reg [31:0] sin;
reg [31:0] angle;
reg [4:0] count;
reg state; reg [31:0] cos_next;
reg [31:0] sin_next;
reg [31:0] angle_next;
reg [4:0] count_next;
reg state_next; always @(posedge clock or posedge reset) begin if (reset) begin cos <= 0; sin <= 0; angle <= 0; count <= 0; state <= 0; end else begin cos <= cos_next; sin <= sin_next; angle <= angle_next; count <= count_next; state <= state_next; end
end

The two states that we will use are 0, an idle state, and 1, a state indicating computation is occurring.

The logic block is defined as follows:

always @* begin // Set all logic regs to a value to prevent any of them holding the value // from last tick and hence being misinterpreted as hardware registers. cos_next = cos; sin_next = sin; angle_next = angle; count_next = count; state_next = state; if (state) begin // Compute mode. cos_next = cos + (direction_negative ? sin_shr : -sin_shr); sin_next = sin + (direction_negative ? -cos_shr : cos_shr); angle_next = angle + (direction_negative ? beta : -beta); count_next = count + 1; if (count == 31) begin // If this is the last iteration, go back to the idle state. state_next = 0; end end else begin // Idle mode. if (start) begin cos_next = K; // Set up initial value for cos. sin_next = 0; // Set up initial value for sin. angle_next = angle_in; // Latch input angle into the angle register. count_next = 0; // Set up counter. state_next = 1; // Go to compute mode. end end
end

The cos_shr and sin_shr signals refer to the values of cos and sin shifted right by count places:

wire [31:0] cos_signbits = {32{cos}};
wire [31:0] sin_signbits = {32{sin}};
wire [31:0] cos_shr = {cos_signbits, cos} >> count;
wire [31:0] sin_shr = {sin_signbits, sin} >> count;

The direction_negative signal is high if direction in the pseudocode algorithm is negative (if $$φ < 0$$), and low if it is positive (if $$φ ≥ 0$$). It is therefore defined to simply be equal to the sign bit of angle:

wire direction_negative = angle;

Lastly, we will define the lookup table used for the beta values:

wire [31:0] beta_lut [0:31];
assign beta_lut = BETA_0;
assign beta_lut = BETA_1;
// ...
assign beta_lut = BETA_31; wire [31:0] beta = beta_lut[count];`