In a recent article on reddit, I saw a prototype for a byte swapping function of the form double swap(double);.
As (nearly) everyone knows, different processor architectures can have different byte orderings. For example, x86 processors use little endian byte order and PowerPC processors support big endian byte order. Software that wants to be portable needs to take byte ordering into account. By convention, network data is sent in big endian order, and C programs have library functions to do the proper byte swapping. ntohs and ntohl take network data and convert it to host byte order (ntohs is read 'network-to-host-short'). Conversely, htons and htonl take host data and convert it to network byte order ('host-to-network-short'). On big endian systems these functions don't actually have to do anything, because the host byte order is already network byte order. On x86 machines the order of the bytes must be reversed. Typically you swap bytes so you can poke the swapped data into buffers for sending on the network or writing to a file.
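As a small illustration of the library functions described above, here is a sketch that packs a 32 bit value into a buffer in network byte order and reads it back. It assumes a POSIX system where arpa/inet.h provides htonl and ntohl (on Windows they come from the Winsock headers instead). The helper names put_u32 and get_u32 are made up for this example.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Store 'value' into buf in network (big endian) byte order. */
void put_u32(unsigned char buf[4], uint32_t value)
{
    uint32_t net = htonl(value);    /* does nothing on big endian hosts */
    memcpy(buf, &net, sizeof net);  /* copy the bytes as they sit in memory */
}

/* Read a network byte order value from buf and return it in host order. */
uint32_t get_u32(const unsigned char buf[4])
{
    uint32_t net;
    memcpy(&net, buf, sizeof net);
    return ntohl(net);
}
```

Whatever the host byte order, put_u32(buf, 0x01020304) leaves the bytes 01 02 03 04 in the buffer, and get_u32 returns the original value.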
The standard library only has byte swapping functions for 16 bit and 32 bit integral types. 8 bit data doesn't need swapping. But sometimes you want to write floating point data to the network or a file. This is problematic in that different processor architectures may use different bit level representations of floating point data, but these days most machines use IEEE 754 implementations that are mostly compatible. Assume for this article that we are not concerned with this level of compatibility. (But don't assume it for your application! If you are sending or storing doubles and floats, it behooves you to understand the platforms you care about.)
The C standard library doesn't have native functions for byte swapping 8 byte double or, for that matter, 32 bit float data types. So almost everyone has to code up their own swapping routines one way or another. It turns out that a naive implementation of byte swapping floats can lead to subtle errors, which is what this article is about.
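One cheap way to start understanding a platform is to ask the compiler what its floating point types look like. The sketch below uses the macros from float.h to check that float and double have the sizes and field widths of IEEE 754 single and double precision. The function name looks_like_ieee754 is invented for this example, and the check is only a first filter: it says nothing about byte order or about how NaNs are encoded.

```c
#include <float.h>
#include <limits.h>

/* Returns 1 if float and double have the layout parameters of
   IEEE 754 binary32 and binary64, 0 otherwise. */
int looks_like_ieee754(void)
{
    return FLT_RADIX == 2
        && sizeof(double) * CHAR_BIT == 64
        && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024
        && sizeof(float) * CHAR_BIT == 32
        && FLT_MANT_DIG == 24 && FLT_MAX_EXP == 128;
}
```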
Byte swapping with integers is usually accomplished one of three ways:
Caveat: on most architectures, doubles and floats need to be properly aligned in memory, with the penalty for misalignment ranging from poor performance to a core dump. When you are moving bytes around and then casting them to floating point types, be careful of alignment.
Approach 1 doesn't work on floats and doubles because you cannot shift floating point values, due to how they are represented in memory. Approaches 2 and 3 are basically the same and produce the same type of results.
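One way to sidestep the alignment problem is to copy bytes with memcpy instead of casting a byte pointer to double *. The sketch below reads a double that sits at an arbitrary offset in a buffer. The name read_double_at is hypothetical, and the bytes in the buffer are assumed to already be in host byte order.

```c
#include <string.h>

/* Read a double stored at an arbitrary (possibly misaligned) offset.
   The risky alternative would be: return *(double *)(buf + offset); */
double read_double_at(const unsigned char *buf, size_t offset)
{
    double d;
    memcpy(&d, buf + offset, sizeof d);  /* byte copy, no alignment requirement */
    return d;
}
```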
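For reference, here is a sketch of three common ways to reverse the bytes of a 32 bit integer. It assumes that approach 1 means shifting and masking, and that approaches 2 and 3 mean copying bytes through a char pointer and through a union. The function names are invented for this example. Reading a union member other than the one last written is fine in C but not in C++, so the third version is C only.

```c
#include <stdint.h>

/* Approach 1 (assumed): shift and mask. Only possible on integer types. */
uint32_t swap32_shift(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}

/* Approach 2 (assumed): reverse the bytes through char pointers. */
uint32_t swap32_ptr(uint32_t v)
{
    uint32_t r;
    unsigned char *src = (unsigned char *)&v;
    unsigned char *dst = (unsigned char *)&r;
    dst[0] = src[3];
    dst[1] = src[2];
    dst[2] = src[1];
    dst[3] = src[0];
    return r;
}

/* Approach 3 (assumed): reverse the bytes through a union. */
uint32_t swap32_union(uint32_t v)
{
    union { uint32_t u; unsigned char b[4]; } in, out;
    in.u = v;
    out.b[0] = in.b[3];
    out.b[1] = in.b[2];
    out.b[2] = in.b[1];
    out.b[3] = in.b[0];
    return out.u;
}
```

All three turn 0x01020304 into 0x04030201. Only the last two carry over to float and double, because they work on the bytes in memory and not on the value.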
So suppose we implement a byte swapping function with the prototype:
/** swap the bytes of input argument 'a' and return the swapped data */
double swap(double a);
If you see a byte swapping function with that signature (or float swap(float);), be very careful: most likely it does not work correctly. Why not?
The problem lies with how a compiler will return a floating point variable, how IEEE 754 floating point representations work and what hardware floating point units are allowed (or required) to do in the face of certain data values. Floating point data is partitioned into a sign bit, an exponent and a mantissa. Consider these as 3 separate data elements that work together. What happens when you byte swap using the above prototype?
If the swapped bytes happen to form a NaN bit pattern, the floating point unit may alter them while the value is being returned as a double (the example output below shows a single bit changing). When this occurs and the receiver swaps the data back to the right byte order, they get different bytes than what you started out with!
Here is an example program that will generate some errors of this type. The code is tagged as C++ but will also compile as C. All it does is iterate over some floating point values, swap them to network byte order, swap them back and print the values that changed. If the swap function works properly, it should not print anything.
source code for doubles (doesn't work)
#include <stdio.h>
#include <math.h>

// swap using char pointers
double swap(double d)
{
    double a;
    unsigned char *dst = (unsigned char *)&a;
    unsigned char *src = (unsigned char *)&d;
    dst[0] = src[7];
    dst[1] = src[6];
    dst[2] = src[5];
    dst[3] = src[4];
    dst[4] = src[3];
    dst[5] = src[2];
    dst[6] = src[1];
    dst[7] = src[0];
    // returning the swapped bytes as a double is where things go wrong
    return a;
}

int main(int argc, char *argv[])
{
    double a;
    double b;
    double c;
    for (a = 0.0; a < 100.0; a += 0.01) {
        // swap to network byte order
        b = swap(a);
        // swap back
        c = swap(b);
        // now a and c should be EXACTLY the same. but if not, print something
        if (a != c) {
            printf("*****\n%21.18f\n%21.18f\n%21.18f\n%21.18f\n%llx\n%llx\n%llx\n",
                   a, b, c, fabs(a - c),
                   *(unsigned long long *)&a,
                   *(unsigned long long *)&b,
                   *(unsigned long long *)&c);
        }
    }
    return 0;
}
Run it and you should get a few instances where the resulting value after 2 swaps is not the same as the value before swapping. The output below is from Windows using Visual Studio. You get similar results using GCC on a Linux system. I am using printf for floats, so be aware that printf does rounding on the last digit, so you have to print enough digits to avoid having printf fool you.
Here is one output set from a value that failed the test:
28.010000000001579000 // original value
1.#QNAN0000000000000 // swapped value used as a double is a NAN (Linux will just print 'nan')
28.010000000008855000 // unswapped value (different)
0.000000000007275958 // difference (big enough to worry about for floating point work)
403c028f5c28f77f // hex of original
7fff285c8f023c40 // intermediate swapped value (the low word f77f of the original swaps to 7ff7, which has been changed to 7fff to make it a NAN)
403c028f5c28ff7f // hex of unswapped data. one bit is flipped
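The hex values in the output above can be checked with integer arithmetic alone. The sketch below reverses the bytes of the original pattern, tests whether the result is a signaling NaN, and shows what setting the quiet bit does to it. It assumes the common IEEE 754 encoding in which the top mantissa bit (bit 51 of a double) marks a quiet NaN, as on x86. The function names are made up for this example, and the idea that the floating point unit quiets the NaN is one plausible reading of the output, not something the program itself proves.

```c
#include <stdint.h>

/* Reverse the eight bytes of a 64 bit pattern using integer shifts only. */
uint64_t swap64(uint64_t v)
{
    uint64_t r = 0;
    int i;
    for (i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xffu);  /* move the low byte of v onto the end of r */
        v >>= 8;
    }
    return r;
}

/* 1 if the pattern is a binary64 signaling NaN: exponent all ones,
   mantissa nonzero, top mantissa bit (bit 51) clear. */
int is_signaling_nan64(uint64_t bits)
{
    uint64_t exponent = (bits >> 52) & 0x7ffu;
    uint64_t mantissa = bits & 0x000fffffffffffffULL;
    return exponent == 0x7ffu && mantissa != 0 && (bits & (1ULL << 51)) == 0;
}

/* What quieting a NaN does to its bits: set bit 51. */
uint64_t quiet_nan64(uint64_t bits)
{
    return bits | (1ULL << 51);
}
```

With these helpers, swap64(0x403c028f5c28f77f) gives 0x7ff7285c8f023c40, which is_signaling_nan64 reports as a signaling NaN. Quieting it gives 0x7fff285c8f023c40, the intermediate value in the output, and swapping that back gives 0x403c028f5c28ff7f, the damaged result.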
The difference may look small but it can be significant. If you are using doubles, then presumably you care about the precision. And the problem on floats is much worse because they don't have a lot of precision to begin with. If you change the above program to use float instead of double (just global replace 'double' with 'float'), you would get errors like this:
source code for swapping floats instead of doubles
89.800773620605469000
-1.#QNAN0000000000000
89.925773620605469000
0.125000000000000000 // difference, really bad for floats
42b399ff
ffd9b342
42b3d9ff
So what do you do? Don't return the swapped bytes as a floating point type. Return them as an integer, or write them into a byte buffer:
// swap but return as an integer
unsigned long long int swap(double d);
OR
// swap into a char buffer. for convenience, return a pointer to that buffer
unsigned char *swap(unsigned char buf[8],double d);
Now you need separate unswap functions that take the opposite parameters:
// unswap from an integral type to a double
double unswap(unsigned long long int);
OR
// unswap from a buffer (note you can't overload on return type so you need a unique name)
double unswapDouble(unsigned char buf[8]);
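A minimal sketch of the buffer variant of these prototypes might look like the following. It assumes an 8 byte double and copies bytes with memcpy, so the swapped bytes never exist as a floating point value. The function bodies are an illustration, not the article's own implementation.

```c
#include <string.h>

/* Swap into a char buffer. For convenience, return a pointer to that buffer. */
unsigned char *swap(unsigned char buf[8], double d)
{
    unsigned char src[8];
    int i;
    memcpy(src, &d, sizeof src);  /* bytes of d in host order */
    for (i = 0; i < 8; i++)
        buf[i] = src[7 - i];      /* reversed into the caller's buffer */
    return buf;
}

/* Unswap from a buffer back into a double in host byte order. */
double unswapDouble(unsigned char buf[8])
{
    unsigned char dst[8];
    double d;
    int i;
    for (i = 0; i < 8; i++)
        dst[i] = buf[7 - i];
    memcpy(&d, dst, sizeof d);    /* only the restored bytes become a double */
    return d;
}
```

The only double that is ever returned is the one in host byte order, which is a valid value the floating point unit has no reason to touch.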
Here is a similar program with swap functions that work properly
source code with byte swapping that works
#include <stdio.h>
#include <math.h>

// swap using char pointers, returning the swapped bytes as an integer
unsigned long long swap(double d)
{
    unsigned long long a;
    unsigned char *dst = (unsigned char *)&a;
    unsigned char *src = (unsigned char *)&d;
    dst[0] = src[7];
    dst[1] = src[6];
    dst[2] = src[5];
    dst[3] = src[4];
    dst[4] = src[3];
    dst[5] = src[2];
    dst[6] = src[1];
    dst[7] = src[0];
    return a;
}

// unswap using char pointers
double unswap(unsigned long long a)
{
    double d;
    unsigned char *src = (unsigned char *)&a;
    unsigned char *dst = (unsigned char *)&d;
    dst[0] = src[7];
    dst[1] = src[6];
    dst[2] = src[5];
    dst[3] = src[4];
    dst[4] = src[3];
    dst[5] = src[2];
    dst[6] = src[1];
    dst[7] = src[0];
    return d;
}

int main(int argc, char *argv[])
{
    double a;
    unsigned long long b;
    double c;
    for (a = 0.0; a < 100.0; a += 0.01) {
        // swap to network byte order
        b = swap(a);
        // swap back
        c = unswap(b);
        // now a and c should be EXACTLY the same. but if not, print something
        if (a != c) {
            // b is an integer, so it is printed only as hex, never with %f
            printf("*****\n%21.18f\n%21.18f\n%21.18f\n%llx\n%llx\n%llx\n",
                   a, c, fabs(a - c),
                   *(unsigned long long *)&a,
                   b,
                   *(unsigned long long *)&c);
        }
    }
    return 0;
}
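A variation on the same idea, sketched here as an alternative and not taken from the program above, avoids asking what the host byte order is at all. It copies the double's bit pattern into a 64 bit integer with memcpy and then writes the bytes out most significant first using shifts. This assumes an 8 byte double and a host where doubles and 64 bit integers share the same byte order, which holds on mainstream platforms. The names put_double_be and get_double_be are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Write d into buf in big endian (network) order, whatever the host order is. */
void put_double_be(unsigned char buf[8], double d)
{
    uint64_t bits;
    int i;
    memcpy(&bits, &d, sizeof bits);  /* raw bit pattern, handled as an integer */
    for (i = 0; i < 8; i++)
        buf[i] = (unsigned char)(bits >> (56 - 8 * i));
}

/* Read a big endian double from buf and return it in host byte order. */
double get_double_be(const unsigned char buf[8])
{
    uint64_t bits = 0;
    double d;
    int i;
    for (i = 0; i < 8; i++)
        bits = (bits << 8) | buf[i];
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Because the shifts operate on the integer value, the same source produces the same bytes on little endian and big endian hosts.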
The moral of this story is that floating point data types aren't just big integers.