Curiously Recurring Vector Math

During the development of my first toy game engine, I wanted to try writing my own maths library. Nothing too crazy, but just enough for some simple rendering and gameplay programming. I ended up with vec2/3/4 and mat4, a minimal implementation for what I needed at the time.

Each of the classes for those types were written separately, with all operators, constructors, shorthands, and everything else full laid out for each type. This resulted in a whole load of duplication and became a maintenance nightmare for younger me. Although having everything fully written out is more efficient for non-optimised builds than a general solution, I still couldn't help wondering if there was a nice way to express vector and matrix classes in a general way that matched the functionality of a verbose implementation.

Full vector and matrix implementations are available on GitHub.

Basic vector implementation

A vector is basically just a pretty container for an array. It largely just wants to implement operators on a contiguous sequence of members. For instance, when you want to multiply a vec2:

template<typename T>
struct t_vec2 {
    T x;
    T y;

    ...

    t_vec2& operator*=(T in) {
        x *= in;
        y *= in;
        return *this;
    }
};

The vec3 is the same with a member z, and the vec4 the same as that with a member w. Each operator and function gets written out individually for each type, meaning a few hundred lines of essentially the same thing across all vectors. The generic alternative to this is passing through a template parameter for the length of the vector, and storing the data as an array:

template<typename T, size_t len>
struct t_vec {
    T data[in] = { 0 };

    ...

    t_vec& operator*=(T in) {
        for(size_t i = 0; i < len; ++i)
            data[i] *= in;
        return *this;
    }
};

And then you can declare your other types from that:

struct vec2f : t_vec<float, 2> {};
struct vec3f : t_vec<float, 3> {};

It's necessary to inherit to a new type instead of just using typedef on the t_vec because you're going to want to add specific functions per type later on, such as magnitudes, dot products, and specialised copy constructors.

This implementation works out until you get to more complex expressions like this:

vec2f v1, v2, v3;
v1 + v2;      /* Fine */
v1 += v2;     /* Fine */
v3 = v1 + v2; /* Not fine */

Because the base class t_vec is using and returning the type t_vec for everything, the compiler needs to be able to deduce types at compile time. For things like shorthand operators, it can handle this without a problem, but things get more complex when combining expressions like assigning the result of an addition because you're combining types of vec2f and t_vec<float, 2>). In other words, we need to be more specific about our types when declaring these functions, because working with the base class type won't cut it.

Curiously vector

The Curiously Recurring Template Pattern (CRTP) is an idiom where when a derived class inherits a base class, it passes itself as a template parameter to that base class. There are lots of good articles on the benefits and caveats of CRTP, but here's a visualisation of that summary:

template<typename Derived>
struct Base {};

struct Foo : Base<Foo> {};

This is often used as a way of enforcing a derived class layout, and offsetting the costs of polymorphism (the dynamic dispatch) from runtime to compile time, though I'm not going to get into it here (the linked article explains everything quite well). For our vector case, just having extra knowledge of our final vector types in our base class is what we want. With the CRTP idiom, we can make all of our shared functions explicitly use the derived type. This means when we have expressions like v3 = v1 + v2, there is a complete consistency between the types we've declared and the types the operators are using.

Putting all of that together looks like this:

template<typename T, size_t len, typename t_vecx>
struct t_vec {
    T data[len] = { 0 };

    ...

    t_vecx& operator+=(const t_vecx &other) {
        for(size_t i = 0; i < len; ++i)
            data[i] += other.data[i];
        return static_cast<t_vecx&>(*this);
    }
}

template<typename T>
struct t_vec2 : t_vec<T, 2, vec2> {};

typedef vec2f t_vec2<float>;

The t_vecx gets compiled to the type we've declared - a t_vec2, t_vec3 or whatever else. By taking in and returning this type rather than the base class, we have complete type consistency across all functions. That previously pesky combination of assignment and addition is no longer a problem!

Extending to new types becomes easy. A vec3 declared in the same way inherits the almost all of the operators overloads they want. Types still need to define a few things themselves (you probably want a vec2 construction that takes in two values, for example), but when declaring vec2/3/4, we're able to use the same code three times over for the majority of what our vectors are doing. When creating matrices (which can use these different vector types in varying combinations for their rows and columns), the savings in what has to be written for the new types is even more significant.

Member access

The above examples use an internal data array, which we can hook up to operator[] for direct member access. But sometimes it's more expressive to refer to members by x/y/z rather than [0]/[1]/[2].

My first attempt at this was conditional compilation of x()/y()/z()/w() member functions using the len template parameter.

template<typename E = T>
typename std::enable_if<(len > 0), E&::type x() {
    return data[0];
}

This is SFINAE - "substitution failure is not an error". This conditional function checks if length is greater than 0, and if so allows the compilation of function x(), which returns the first element of our array. Otherwise, the function is omitted without causing any compilation problems. std::enable_if comes from the <type_traits> header, but you can circumvent this with your own four line implementation.

Whilst functional, it ... kind of sucks. All the cool maths libraries allow direct member access, and also allow you to provide more context through different member names. For example, glm declares its members through unions. vec3 members are laid out like this:

union { T x, r; };
union { T y, g; };
union { T z, b; };

Then you have the choice of using the xyz convention for coordinates and the rgb convention for colours, and that extra bit of expression is desirable to have available. And it's direct member access, rather than a conditionally compiled function. In our class implementation, this can be done with another template parameter in which we pass through the members.

First, we define a class to provide access to its internal members. A default implementation is useful to pass through so you have the choice of not making something more specific later on. We have the member implementation define the operator[] access, and remove that from the t_vec class.

template<typename T, size_t len>
struct t_vec_members {
    T data[len] = { 0 };

    T& operator[](size_t i) {
        return data[i];
    }

    const T& operator[](size_t i) const {
        return data[i];
    }
};

We used modify things things internally by accessing the data array directly, but since we now won't always have a data member, we'll need to provide some other access. In t_vec, we create a wrapper around the operator[i] declared in the member struct. The t_vec struct now looks like this:

template<typename T, size_t len, typename t_vecx, typename members = t_vec_members<T, len>>
struct t_vec : members {

    T& data(size_t i) {
        return (*this)[i];
    }

    const T& data(size_t i) const {
        return (*this)[i];
    }

    ...
}

One find and replace later to change all of our data[i] accesses to data(i) and we're done. I would suggest keeping [] access through a macro, but wouldn't want to upset anyone. The implementation can now hook into any other members we provide, so long as they implement operator[].

Creating members for, and then declaring, a vec3 now looks like this:

template<typename T>
struct t_vec3_members {
    union { T x, r; };
    union { T y, g; };
    union { T z, b; };

    T& operator[](size_t i) {
        return (&x)[i];
    }

    const T& operator[](size_t i) const {
        return (&x)[i];
    }
};

template<typename T>
struct t_vec3 : t_vec<T, 3, t_vec3<T>, t_vec3_members<T>> {};

And just like that, we can now create a vec3 and call myvec3.x to read or write to the first value without any other changes to our main vec.

Closing notes

The provided vector class shows a way to a generic implementation for vectors of multiple lengths with ease. The derived classes have the option of implementing nothing themselves, but can optionally provide their own constructors, functions, and member access. In a few hundred lines, we are able to declare vectors of any length, and matrices of any dimensions.

Through some very rudimentary benchmarking, there's unquestionably a runtime cost to accessing everything through functions and operators internally. Running some tests going through operators and assignment, this implementation was approximately five times slower than glm and my handwritten vectors. This cost is phased out at -O2 and beyond, but I didn't examine the assembly to see if everything compiles down to be identical.

It's a shame that the unoptimised build is significantly slower than the alternatives since that eliminates any desire to use it for any game programming. It's nice to know that the implementation is functional, and hanging around as a tiny header that could slot in somewhere in the future.