This is a rehash of something I wrote in a forum post something like 10 years ago. It turns out that forum prohibited archiving in its `robots.txt`

so it’s not on the Internet Archive. The original operator of said forum hosted a copy (with broken formatting) of the original posts for a while, but at a different URL, breaking all links. And now that copy’s gone too, again with archiving disabled apparently. Sigh.

I got asked about this yesterday; I don’t have a copy of my original derivation anymore either, but here’s my best attempt at reconstructing what I probably wrote back then, and hopefully it won’t get lost again this time.

Suppose you’re given a unit quaternion and a vector . Quaternions are a common rotation representation in several fields (including computer graphics and numerical rigid-body dynamics) for reasons beyond the scope of this post. To apply a rotation to a vector, one computes the quaternion product , where is implicitly identified with the quaternion with real (scalar) part 0 and as its imaginary part, and denotes the conjugate of . Such quaternions with a real part of 0 are also referred to as “pure imaginary” quaternions. Anyway, the result of the above product is another pure imaginary quaternion, corresponding to the rotated vector.

This is all explained and motivated elsewhere, and I won’t bother doing so here. Now generally, you often want to apply the same transformation to many vectors. In that case, you’re better off turning the quaternion into the equivalent 3×3 rotation matrix first. You can look up the formula elsewhere or work it out from the expression above with some algebra. That’s not the topic of this post either.

But sometimes you really only want to transform a single vector with a quaternion, or have other reasons for not wanting to expand to an explicit 3×3 (or larger) matrix first, like for example trying to minimize live register count in a shader program. So let’s look at ways of doing that.

### The direct way

First, I’m going to split quaternions in their real (scalar) and imaginary (vector) parts, writing where is the real part and imaginary. The conjugate of is given by .

The product of two quaternions and is given by

where denotes the usual dot product and the cross product. If you’re not used to seeing this in vector notation, I recommend using this (or something more abstract like geometric algebra) for derivations; writing everything in terms of scalars and the basis elements makes you miss the forest for the trees.

Anyway, armed with this formula, we can compute our final product without too much trouble. Let’s start with the part:

Not so bad. Now we have to multiply it from the right by , which I’ll do in multiple steps. First, let’s take care of the real part, by multiplying our just-computed values for with using the general multiplication formula above:

because the first two dot products are identical and the cross product of and is perpendicular to . This proves that is indeed a pure imaginary quaternion again, just like the we started out with.

Nice to know, but of course we’re actually here for the vector part:

which so far has used nothing fancier than the antisymmetry of the cross product .

If we pull out and name the shared cross product, we get

which is the direct expression for the transformed vector from the formula. This is what you get if you just plug everything into the formulas and apply basic algebraic simplifications until you run out of obvious things to try (which is exactly what we did).

In terms of scalar operation count, this boils down to two cross products at 6 multiplies and 3 additions (well, subtractions) each; one 3D dot product at 3 multiplies and 2 additions; 3 vector-by-scalar multiplications at 3 multiplies each; two scalar multiplies (to form and , although the latter can also be computed via addition if you prefer); and finally 3 vector additions at 3 adds each. The total operation count is thus 26 scalar multiplies and 17 additions, unless I miscounted. For GPUs, a multiply-add “a*b+c” generally counts as a single operation, and in that case the scalar operation count is 9 scalar multiplies and 17 scalar multiply-adds.

### Stand back, I’m going to try algebra!

We can do better than that. So far, we haven’t used that is a unit quaternion, meaning that . We can thus plug in for , which yields:

This might look worse, but it’s progress, because we can now apply the vector triple product identity

with , , , leaving us with:

The two remaining terms involving both multiply by two, so we use a slightly different shared temporary for the final version of our formula:

Final scalar operation count: without multiply-adds, the two cross products sum to 12 multiplies and 6 adds, scaling t by two takes 3 multiplies (or adds, your choice), and the final vector-by-scalar multiply and summation take 3 multiplies and 6 adds, respectively. That’s 18 multiplies and 12 adds total, or 15 multiplies and 15 adds if you did the doubling using addition.

Either way, that’s a significant reduction on both counts.

With multiply-adds, I count a total of 3 multiplies and 9 multiply-adds for the cross products (the first cross product has nothing to sum to, but the second does); either 3 multiplies or 3 adds (your choice again) for the doubling; and another 3 multiply-adds for the portion. That’s either 6 multiplies and 12 multiply-adds or 3 multiplies, 3 adds and 12 multiply-adds. Furthermore some GPUs can fold a doubling straight into the operand without counting as another operation at all; in that case we get 3 multiplies and 12 multiply-adds.

For comparison, multiplying a vector by a 3×3 rotation matrix takes 9 multiplies and 6 additions (or 3 multiplies plus 6 multiply-adds). So even though this is much better than we started out with, it’s generally still worthwhile to form that rotation matrix explicitly if you plan on transforming lots of vectors by the same quaternion, and aren’t worried about GPU vector register counts or similar.