The best way to use SSE is to use the __m128 intrinsic type directly. Unfortunately the Visual Studio debugger displays the components backwards (w, z, y, x). Bleh. Here is a change to autoexp.dat that corrects the order.
First comment out this line:
__m128=$BUILTIN(M128)
And add this line:
__m128=<m128_f32[0]>, <m128_f32[1]>, <m128_f32[2]>, <m128_f32[3]>
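To see which lane is which outside the debugger, here is a minimal sketch that stores a __m128 to a plain float array. The helper names (store_lanes, make_xyzw, make_wzyx, check_lane_order) are made up for illustration. It assumes an x86 compiler that ships xmmintrin.h, and it uses _mm_storeu_ps instead of m128_f32 because that member is specific to Visual C++.

```cpp
#include <cassert>
#include <xmmintrin.h>

// Hypothetical helper: copy the four lanes of v to out[0..3] in memory order.
inline void store_lanes(__m128 v, float out[4]) {
    _mm_storeu_ps(out, v);
}

// _mm_setr_ps takes its arguments in memory order: lane 0 first.
inline __m128 make_xyzw(float x, float y, float z, float w) {
    return _mm_setr_ps(x, y, z, w);
}

// _mm_set_ps takes the same lanes highest first: (w, z, y, x).
inline __m128 make_wzyx(float w, float z, float y, float x) {
    return _mm_set_ps(w, z, y, x);
}

// Both constructions must put x in lane 0 and w in lane 3.
inline void check_lane_order() {
    float a[4];
    float b[4];
    store_lanes(make_xyzw(1.0f, 2.0f, 3.0f, 4.0f), a);
    store_lanes(make_wzyx(4.0f, 3.0f, 2.0f, 1.0f), b);
    for (int i = 0; i < 4; ++i) {
        assert(a[i] == b[i]);
    }
    assert(a[0] == 1.0f);  // x lives in lane 0
    assert(a[3] == 4.0f);  // w lives in lane 3
}
```

Lane 0 is the element that m128_f32[0] reads on Visual C++, so the replacement autoexp.dat line lists the components as x, y, z, w.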
Why not just define your own type, aligned to a 16-byte boundary?
- NWW
This is how I declare the attributes in my Vector4 class:
union
{
    struct { float x, y, z, w; };  // named access to each component
    __m128 vec;                    // the same 16 bytes as one SSE value
};
In this way you can still access individual components using v.x, v.y, etc.
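A fuller sketch of how such a class might look is below. Everything beyond the union itself (the constructors and the add function) is an assumption for illustration, not the commenter's actual code. Note that the anonymous struct is a compiler extension (accepted by Visual C++, GCC and Clang), and reading x after writing vec relies on those compilers tolerating union type punning.

```cpp
#include <cassert>
#include <xmmintrin.h>

// Hypothetical Vector4 built around the union shown above.
struct Vector4 {
    union {
        struct { float x, y, z, w; };  // anonymous struct: compiler extension
        __m128 vec;                    // forces 16-byte size and alignment
    };

    // _mm_setr_ps stores its arguments in memory order, so x lands in lane 0.
    Vector4(float x_, float y_, float z_, float w_)
        : vec(_mm_setr_ps(x_, y_, z_, w_)) {}

    explicit Vector4(__m128 v) : vec(v) {}
};

// Component-wise addition using a single SSE instruction.
inline Vector4 add(const Vector4& a, const Vector4& b) {
    return Vector4(_mm_add_ps(a.vec, b.vec));
}
```

With this, add(Vector4(1, 2, 3, 4), Vector4(10, 20, 30, 40)).x reads back as 11, and the __m128 member gives the whole class the 16-byte alignment asked about earlier.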
In the Domino math library I use the raw SIMD intrinsic type directly, as this yields the best performance. You can see the same approach in DirectXMath and in this article: http://www.gamasutra.com/view/feature/4248/designing_fast_crossplatform_simd_.php
On some compilers __m128 is a true built-in type, so you don’t get to roll your own SIMD value.
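For comparison, the raw approach might look like the sketch below: a plain typedef of __m128 with free functions, and no wrapper class. The names Vec4, make_vec4 and dot4 are made up for this example and are not taken from Domino or DirectXMath. It sticks to SSE1 instructions and assumes an x86 compiler with xmmintrin.h.

```cpp
#include <cassert>
#include <xmmintrin.h>

// Hypothetical alias: the vector type is the raw intrinsic type itself.
typedef __m128 Vec4;

inline Vec4 make_vec4(float x, float y, float z, float w) {
    return _mm_setr_ps(x, y, z, w);
}

// Four-component dot product, returned as a scalar.
inline float dot4(Vec4 a, Vec4 b) {
    __m128 m = _mm_mul_ps(a, b);                                // (x, y, z, w) products
    __m128 shuf = _mm_shuffle_ps(m, m, _MM_SHUFFLE(2, 3, 0, 1)); // (y, x, w, z)
    __m128 sums = _mm_add_ps(m, shuf);                          // (x+y, x+y, z+w, z+w)
    shuf = _mm_movehl_ps(shuf, sums);                           // lane 0 becomes z+w
    sums = _mm_add_ss(sums, shuf);                              // lane 0 = x+y+z+w
    return _mm_cvtss_f32(sums);
}
```

Here dot4(make_vec4(1, 2, 3, 4), make_vec4(5, 6, 7, 8)) evaluates to 70. Because Vec4 is just __m128, values are passed around as the compiler's native SIMD type.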