Quaternions
As I'm starting to put together the drone VR project, I've noticed that I've been remiss in explaining a facet of programming we're going to depend upon, one that is well understood among mathematicians but is often viewed as too complex for most programmers. One of the many tasks we face in VR or game development is the need to know orientation, that is, how to rotate objects so they appear oriented in the manner we have planned.
Now the easiest way to handle rotations is to use Euler angles. That is to say, imagine a 3d cartesian coordinate system centered at the center of the 3d object we want to rotate. The straightforward way of doing it is to rotate the object around each axis by a certain number of degrees, in 3 separate steps.
The above shows how a gimbal works. Gimbals have the property of being extremely easy to understand while also being susceptible to a very problematic detail. If one axis rotates too close to another, the two rotators become locked together. Once this "gimbal lock" occurs, there is no rotation that will let the gimbals exit the alignment; they must be reoriented. Apollo 11 famously had a gimbal lock problem and there was worry that Apollo 13 would as well. The method NASA used for the Apollo missions was to prevent the spacecraft from rotating too close to the poles by negating (or flipping) the gimbals. This was viewed as easier than adding a 4th gimbal, which solves the problem as well.
Now looking at the left side, you can see the effect of gimbal lock: no matter how much you try to rotate once locked, a full dimension is gone and the rotation looks like a 2d rotation instead of a 3d one. As clever programmers, we can get around this problem by doing something similar to what NASA did for Apollo: we could detect when the rotation will be too close to the dreaded singular pole value (or 180 degrees in any direction) and refuse to let the rotation actually get there. However, we're then going to want to interpolate between two orientations, so that for each frame displayed we know how far the object has turned on its way to the destination. This is, to put it simply, tricky with Euler angles.
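To make the lost dimension concrete, here is a minimal Python sketch. It assumes a yaw-pitch-roll (Z, then Y, then X) convention, which is my choice for illustration; in that convention the lock sits at 90 degrees of pitch, and the exact locked angle depends on which convention you pick. The helper names are hypothetical.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    # Plain 3x3 matrix product.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyx(yaw_deg, pitch_deg, roll_deg):
    # Assumed convention: roll about X first, then pitch about Y, then yaw about Z.
    yaw, pitch, roll = (math.radians(d) for d in (yaw_deg, pitch_deg, roll_deg))
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))

def same_rotation(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

# At 90 degrees of pitch, only the difference (roll - yaw) matters,
# so two different yaw/roll pairs give the same orientation.
locked_a = euler_zyx(30, 90, 10)
locked_b = euler_zyx(50, 90, 30)

# Away from the pole, the same yaw/roll pairs give different orientations.
free_a = euler_zyx(30, 45, 10)
free_b = euler_zyx(50, 45, 30)

print(same_rotation(locked_a, locked_b))  # True
print(same_rotation(free_a, free_b))      # False
```

Three numbers go in, but at the pole only two of them still change the result. That is the missing dimension.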
So not only can we not fully represent a 3d rotation in space without checking for gimbal lock and fudging things a bit to avoid it, we also must accept that smoothly changing the angles of a rotated object is now quite complex, and certainly not a solution when we want 60 frames a second. Yet another problem with Euler angles is that order matters. If you rotate an object about the X axis by 45 degrees and then about the Y axis by 45 degrees, it will end up pointed and rotated completely differently than if you had first rotated about the Y axis by 45 degrees and then about the X axis by 45 degrees.
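The order problem is easy to check numerically. This is a small sketch in Python; the `nose` vector and the two helper functions are hypothetical names I picked for the example.

```python
import math

def rotate_x(p, deg):
    # Rotate point p about the X axis by deg degrees.
    a = math.radians(deg)
    x, y, z = p
    return (x, y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a))

def rotate_y(p, deg):
    # Rotate point p about the Y axis by deg degrees.
    a = math.radians(deg)
    x, y, z = p
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))

nose = (0.0, 0.0, 1.0)  # hypothetical "forward" direction of the object

x_then_y = rotate_y(rotate_x(nose, 45), 45)
y_then_x = rotate_x(rotate_y(nose, 45), 45)

print([round(c, 3) for c in x_then_y])
print([round(c, 3) for c in y_then_x])
```

The first ordering leaves the nose pointing near (0.5, -0.707, 0.5) and the second near (0.707, -0.5, 0.5): same two angles, two different orientations.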
This means that, try as we might, there's not a really good time to be had with Euler angles, no matter how simple the concept might be. Now what if I told you that not only is there something better, but the only downside is that our comfort zone of natural, experiential understanding of the 3d world we live in must be challenged in order to fully grasp it. Everyone immediately reaches for a way to get back to Euler angles before displaying the rotations, so all libraries have a call to convert between Euler angles and quaternions. But trust me, only do that when a human needs to read the angle or change the angle from a text prompt.
Why are quaternions so difficult? Well, remember I mentioned NASA could have solved the Apollo gimbal lock problem by adding a 4th gimbal? The solution is buried in the reason why a 4th independent rotating dimension is necessary. Rotations in 3d space involve 4 dimensions. To rotate an object, one must rotate it in the 4th dimension and then un-rotate by the same amount in the 4th dimension to get back to a 3d orientation. This rotation involves complex numbers, which, if you remember from math, combine imaginary numbers with real numbers. But I get ahead of myself. First, what are imaginary numbers? Imaginary numbers are numbers invented to solve equations that are otherwise impossible to solve. The best example is `x^2 + 1 = 0`, which if we try to solve for x gives us `x^2 = -1`. Yikes, there is no real number that answers this, as squaring a real number never gives a negative result.
Mathematicians don't like unsolvable problems, so the imaginary number concept was created. We say `i^2 = -1`. There isn't a real solution to it, it's imaginary, but the existence of the imaginary concept gives us the ability to describe and solve problems that couldn't be solved otherwise. So a complex number (a member of the set `CC`) is one that combines a real number (a member of the set `RR`) with a real multiple of the imaginary number `i`.
Now that we understand that `i^2 = -1`, we can see what happens if we raise `i` to other powers. Observe:
`i^0 = 1`
`i^1 = i`
`i^2 = -1`
`i^3 = -i`
`i^4 = 1`
`i^5 = i`
`i^6 = -1`
`i^7 = -i`
A pattern emerges! Raising `i` to new powers cycles through the sequence `(1, i, -1, -i, ...)`. If you think about rotating on a cartesian plane, this is exactly the same pattern that comes from repeatedly rotating a point `90^circ` counter-clockwise: `(x, y, -x, -y)`. So hey, if multiplying by powers of `i` gets us `90^circ` rotations, what happens if we multiply using the `sin` and `cos` functions? We get a rotator for any angle `theta`: `q = cos theta + i sin theta`. Multiplying a complex number by `q` rotates it by `theta`.
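Python has complex numbers built in (it spells `i` as `1j`), so both the cycle and the rotator can be tried directly. This is just a sketch; `rotate_2d` is a name I made up for the example.

```python
import math

# Powers of i cycle through 1, i, -1, -i and then repeat.
powers = [1j ** n for n in range(8)]

def rotate_2d(x, y, theta_deg):
    # Build q = cos(theta) + i*sin(theta) and multiply the point by it.
    theta = math.radians(theta_deg)
    q = complex(math.cos(theta), math.sin(theta))
    p = complex(x, y) * q
    return p.real, p.imag

quarter_turn = rotate_2d(1.0, 0.0, 90)   # approximately (0, 1)
eighth_turn = rotate_2d(1.0, 0.0, 45)    # approximately (0.707, 0.707)
```

Note that a `90^circ` rotator is exactly `i`, since `cos 90^circ = 0` and `sin 90^circ = 1`, which is why the power table and the rotation pattern line up.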
Now we understand that complex numbers give us the ability to rotate on a 2d plane. What happens if we want to move up to 3d space? We can do that by adding two more imaginary numbers, `j` and `k`. This gives us the ability to denote a 3d rotation as `q = s + xi + yj + zk`, where `s`, `x`, `y`, and `z` are all in the set of real numbers `RR`. A guy by the name of Sir William Rowan Hamilton famously recognized in 1843 that `i^2 = j^2 = k^2 = ijk = -1`. From his famous equation, which extends the `i^2 = -1` rule to all three imaginary numbers, the set of quaternions (usually written `bbb H` in his honor) was born.
Hamilton's equation also gives us the following:
`i*j=k` | `j*k=i` | `k*i=j`
`j*i=-k` | `k*j=-i` | `i*k=-j`
These relationships are very similar to the cross-product rules for unit cartesian vectors:
`x xx y = z` | `y xx z = x` | `z xx x = y`
`y xx x = -z` | `z xx y = -x` | `x xx z = -y`
So now we see that a quaternion consists of a scalar and a vector. That is to say, if you look at `q = s + xi + yj + zk`, you can see that `s` is a scalar and `xi + yj + zk` is a vector, so most of the time we will notate a quaternion as `q = [s, v]`, where `s` is the scalar and `v` is the vector portion of the quaternion.
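The `[s, v]` notation also gives a compact way to multiply two quaternions: the scalar part is `s1 s2` minus the dot product of the vectors, and the vector part is `s1 v2 + s2 v1` plus their cross product. Here is a rough Python sketch of that rule, storing a quaternion as a `(s, (x, y, z))` pair; the function names are my own, not from any particular library.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def qmul(q1, q2):
    # [s1, v1][s2, v2] = [s1*s2 - v1.v2, s1*v2 + s2*v1 + v1 x v2]
    s1, v1 = q1
    s2, v2 = q2
    c = cross(v1, v2)
    s = s1 * s2 - dot(v1, v2)
    v = tuple(s1 * v2[n] + s2 * v1[n] + c[n] for n in range(3))
    return (s, v)

# The three imaginary units as [0, unit vector] quaternions.
i = (0.0, (1.0, 0.0, 0.0))
j = (0.0, (0.0, 1.0, 0.0))
k = (0.0, (0.0, 0.0, 1.0))

print(qmul(i, j) == k)                  # True: i*j = k
print(qmul(j, i) == (0.0, (0.0, 0.0, -1.0)))  # True: j*i = -k
print(qmul(qmul(i, j), k)[0])           # -1.0: ijk = -1
```

The cross product term is what makes the order of multiplication matter, which is the same sign flip you see between `i*j` and `j*i` in Hamilton's table.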
Using this new notation, we can see quaternions have similar properties to complex numbers. Remember I was saying that to do a rotation in 3d space, we need to factor in the 4th dimension? Quaternions already have the ability to understand an imaginary dimension. However, if we were to build a quaternion rotation by simply copying the complex 2d generic rotator `q = cos theta + i sin theta`, we'd naively have `q = [cos theta, sin theta hat v]`, where `hat v` is the unit vector along the axis we are rotating about. This doesn't take into consideration that the rotation also rotated into the imaginary direction. If you were to directly plot the result, it wouldn't actually point in the right direction. So what we must do is go half way in the imaginary direction, and then go half way in the inverse imaginary direction. I know, this doesn't make sense. One of the hardest parts about understanding 3d rotations is that they involve `q^-1`: a point `p`, written as the quaternion `[0, p]`, is rotated by computing `q p q^-1`. It's necessary to go half way down what feels like right and then half way down the inverse to get to where you want to go.
This means that the general form of a rotation quaternion is:
`q = [cos (1/2 theta), sin (1/2 theta) hat v]`.
Notice that this 1/2 angle means that, as far as the quaternion is concerned, turning all the way around takes the full `720^circ`. So why does that make sense? Circles are `360^circ` around, right? Well, they are also 2d objects. In 3d the rotation is "stored up" in the 4th dimension. Try this experiment: take a belt, twist it `360^circ`, and then flip it by switching which hands hold which end. You will see the belt is still twisted. Straighten the belt and try twisting it `720^circ`. Now flip it again, and marvel at a straight belt. The reason is that the rotation got canceled out by the second rotation. This is called the Dirac belt trick.
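Here is a minimal Python sketch of the half-angle form in action, with quaternions stored as flat `(s, x, y, z)` tuples. The names `axis_angle` and `rotate` are hypothetical, and the code assumes unit quaternions, for which the inverse `q^-1` is just the conjugate.

```python
import math

def qmul(a, b):
    # Hamilton product of two (s, x, y, z) quaternions.
    a1, b1, c1, d1 = a
    a2, b2, c2, d2 = b
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conjugate(q):
    s, x, y, z = q
    return (s, -x, -y, -z)

def axis_angle(axis, theta_deg):
    # q = [cos(theta/2), sin(theta/2) * unit axis]
    half = math.radians(theta_deg) / 2.0
    n = math.sqrt(sum(c * c for c in axis))
    s = math.sin(half)
    return (math.cos(half), s * axis[0] / n, s * axis[1] / n, s * axis[2] / n)

def rotate(p, q):
    # p' = q p q^-1, with p embedded as the quaternion [0, p].
    r = qmul(qmul(q, (0.0, p[0], p[1], p[2])), conjugate(q))
    return r[1:]

z_axis = (0.0, 0.0, 1.0)
turned = rotate((1.0, 0.0, 0.0), axis_angle(z_axis, 90))  # approximately (0, 1, 0)

q360 = axis_angle(z_axis, 360)  # scalar part is approximately -1
q720 = axis_angle(z_axis, 720)  # scalar part is approximately +1
```

A `360^circ` turn brings every point back to where it started, yet the quaternion itself has become `-1` instead of `1`. Only after `720^circ` does the quaternion return to its starting value, which is the numeric version of the belt still being twisted.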
I realize this post was really heavy on math compared to my usual fare; however, quaternions are insanely useful. It is important to understand how they work, as they are going to be used in my tutorials about how to simulate an arm rotating in space from the Daydream controller inputs. We will use this data to tell a drone where to go in the real world, so rotations are really going to matter to get right.
Thankfully, the Qualcomm math library is heavily optimized for Snapdragon processors and makes heavy use of quaternion math. The thing we're going to be very interested in with it is interpolation, or SLERP (Spherical Linear intERPolation). The way that works is we take two quaternions, `qS` (start) and `qD` (destination), then we find where the quaternion would be at time index `t`. The output is `qS` when `t` is 0 and `qD` when `t` is 1. The linear formula looks like this:
`q = qS + t(qD - qS)`
Note that this straightforward interpolation does have a few problems. If the dot product of `qS` and `qD` is negative, then the path that this chooses will appear to be the "long way" around (it has a previous rotation stored in the quaternion within the imaginary 4th dimension, so it has to travel all the way around the sphere to make it to where you want it to go). Solving this is simple: negate one of the quaternions if the dot product is negative.
Also, there is a chance that `qD` and `qS` are very close in orientation to each other, so the angle `theta` between them is tiny and `sin theta` is very close to 0. If this happens, results are undefined, because the full spherical formula, `q = (sin((1 - t) theta))/(sin theta) qS + (sin(t theta))/(sin theta) qD`, divides by `sin theta`, and dividing by 0 is undefined. To solve for this, just linearly interpolate between `qS` and `qD` if the quaternions are too close to each other.
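Putting both fixes together, a SLERP routine might look roughly like the Python sketch below. It assumes unit quaternions stored as `(s, x, y, z)` tuples. The function names and the `eps` threshold are my own illustrative choices, not the Qualcomm library's API.

```python
import math

def dot4(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(q):
    n = math.sqrt(dot4(q, q))
    return tuple(c / n for c in q)

def slerp(q_s, q_d, t, eps=1e-6):
    # Assumes q_s and q_d are unit quaternions; eps is an arbitrary cutoff.
    d = dot4(q_s, q_d)

    # Negative dot product: negate one quaternion to take the short way around.
    if d < 0.0:
        q_d = tuple(-c for c in q_d)
        d = -d

    # Nearly identical orientations: sin(theta) is close to 0, so fall back
    # to the linear formula q = qS + t(qD - qS) and renormalize.
    if d > 1.0 - eps:
        return normalize(tuple(a + t * (b - a) for a, b in zip(q_s, q_d)))

    theta = math.acos(d)
    sin_theta = math.sin(theta)
    w_s = math.sin((1.0 - t) * theta) / sin_theta
    w_d = math.sin(t * theta) / sin_theta
    return tuple(w_s * a + w_d * b for a, b in zip(q_s, q_d))

# Example: halfway between "no rotation" and a 90 degree turn about Z.
start = (1.0, 0.0, 0.0, 0.0)
dest = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
halfway = slerp(start, dest, 0.5)  # approximately a 45 degree turn about Z
```

Calling `slerp` once per frame with `t` stepping from 0 to 1 gives the in-between orientations. The renormalize step in the fallback branch is there because a plain linear blend of two unit quaternions is generally not unit length.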
If you're still with me, I'll be posting sample code that makes use of quaternions in a generic Daydream template. If you totally don't understand what I'm talking about here, you're welcome to take the template project and use it however you want, but when you're ready to understand how it works, please feel free to ask questions.