Applying Magnitudes¶

Every unit conversion factor is a magnitude — a nonzero real number. When we apply it to a value, conceptually, we’re just multiplying the value by that number. However, that doesn’t mean that multiplying by a number is the best implementation! Consider these examples.

Factor: $\frac{5}{8}$ ; value: 12 (type: int).
- Computationally, we don’t want to leave the integral domain. But of course, $\frac{5}{8}$ can’t be represented as an integer! This suggests we should perform two operations: first multiply by 5, and then divide by 8, yielding 7.
Factor: $\frac{1}{13}$ ; value: 91.0f (type: float).
- Conceptually, this has an exactly representable answer: $91\left(\frac{1}{13}\right) = 7$ . However, if we multiply by the single number (1.0f / 13.0f), we obtain the approximation 7.0000004768371582! This suggests that for $\frac{1}{13}$ , at least, it would be better to divide by 13.0f.

Au is thoughtful about how we apply conversion factors. We first compute a category for the factor, which dictates the best strategy for applying it. We may also take into account whether we’re dealing with an integral or floating point type.

Magnitude categories¶

We represent conversion factors with magnitudes. These representations support exact symbolic math for products and rational powers. They also support querying for numeric properties of the number, such as whether it is an integer, whether it’s irrational, and so on.

For purposes of applying to a value, we find four useful categories of magnitude.

Integers.
Reciprocal integers.
Rational numbers (other than the first two categories).
Irrational numbers.

These categories are mutually exclusive and exhaustive. Below, we’ll explain the best strategy for each one.

Integers¶

Applying an integer magnitude to a type T is simple: we multiply by that integer’s representation in T.

This always compiles to a single instruction, and always produces exact answers whenever they are representable in the type T.

Reciprocal integers¶

If a magnitude is not an integer, but its reciprocal is, then we divide by its reciprocal. For example, in converting a value from inches to feet, we will divide by $12$ , instead of multiplying by the representation of $\frac{1}{12}$ , which would be inexact.

As with integers, this always compiles to a single instruction, and always produces exact answers whenever they are representable in the type T.

Rational numbers¶

Again, to be clear, this category only includes rationals that are neither integers nor reciprocal integers. So, for example, neither $2$ nor $\frac{1}{5}$ falls in this category, but $\frac{2}{5}$ does.

This category is interesting, because it’s the first instance where our strategy depends on the type T to which we’re applying the factor. The best approach differs between integral and non-integral types.

Integral types¶

Here, we multiply by the numerator, then divide by the denominator. This compiles to two operations instead of one, but it’s the only way to get reasonable accuracy.

There’s another issue: the multiplication operation can overflow. This means we can produce wrong answers in some instances, even when the correct answer is representable in the type! For example, let’s say our value is std::numeric_limits<uint64_t>::max(), and we apply the magnitude $\frac{2}{3}$ : by the time we divide by $3$ , the multiplication by $2$ has already lost our value to overflow.

We might be tempted to prevent this by doing the division first. In the above example, this would certainly give us a much closer result! However, the cost would be reduced accuracy for smaller values, which are far more common. Consider applying $\frac{2}{3}$ to a smaller number, such as 5. The exact rational answer is $\frac{10}{3}$ , which truncates to 3. If we perform the multiplication first, this is what we get, but doing the division first would give 2.

If you know that your final answer is representable, and you have an integer type with more bits than your type T, then you can work around this issue manually by casting to the wider type, applying the magnitude, and casting back to T. However, if you don’t have a wider integer types, we know of no general “solution” that wouldn’t do more harm then good.

Floating point types¶

Applying a rational magnitude $\frac{N}{D}$ to a value of floating point type T presents a genuine tradeoff. On the one hand, we could take the same approach as for the integers, and perform two operations: multiplying by $N$ , then dividing by $D$ . On the other hand, we could simply multiply by the single number which best represents $\frac{N}{D}$ . Here’s a summary of the tradeoffs:

Criterion	Weighting	Multiply-and-divide: `(val * N) / D`	Single number: `val * (N / D)`
Instructions	medium	2	1
Overflow	low	More vulnerable	Less vulnerable
Exact answers for multiples of $D$	low	Guaranteed	Not guaranteed

Overall, we aren’t worried much about missing out on exact answers. Users of floating point know they need to handle the possibility that a calculation’s result can be one or two representable values away from the best possible result. (This is commonly called the “usual floating point error”.)

We also aren’t very worried about overflow. Even float has a range of $10^{38}$ , while going from femtometers to Astronomical Units (AU) spans a range of “only” about $10^{26}$ .

Going from 1 instruction to 2 is a moderate concern, which means that it outweighs the other two considerations. It represents a runtime penalty relative to the usual approach people take without a units library, which is to compute a single conversion factor. We always strive to avoid runtime penalties in units libraries! The reason we don’t consider this even more serious is that unit conversions should never occur in the “hot loop” for a program; thus, this performance hit isn’t really meaningful.

Outcome: we represent a rational conversion factor $\frac{N}{D}$ with a single number when applying it to a floating point variable.

Irrational numbers¶

There is no reason to try splitting an irrational number into parts to get an exact answer. Since we’re multiplying our variable by an irrational number, we know the result won’t be exactly representable. Therefore, we always simply multiply by the closest representation of this conversion factor.

The one difference is that we forbid this operation for integral types, because it makes no sense.

Summary and conclusion¶

Applying a conversion factor to a numeric variable of type T can be a tricky and subtle business. Au takes a thoughtful, tailored approach, which can be summarized as follows:

If the conversion factor multiplies — or divides — by an exact integer, then we do that.
Otherwise, if it’s a rational number $\frac{N}{D}$ , and T is integral, then we multiply by $N$ and divide by $D$ (each represented in T).
Otherwise, we simply multiply by the nearest representation of the conversion factor in T — with the exception that if T is integral, we raise a compiler error for irrational factors.