Compiling with and without vectorization

We can compile and link without vectorization (no optimization flags):

c++ -o novec.x vecexample.cpp

and with vectorization (and additional optimizations):

c++ -O3 -o vec.x vecexample.cpp

The speedup depends on the size of the vectors. The example here was run with \( 10^7 \) elements on a PC with Ubuntu 14.04 as operating system and an Intel i7-4790 CPU running at 3.60 GHz.

Compphys:~ hjensen$ ./vec.x 10000000
Time used  for vector addition = 0.0100000
Compphys:~ hjensen$ ./novec.x 10000000
Time used  for vector addition = 0.03000000000

This particular C++ compiler speeds up the above loop operations by a factor of 3. Performing the same operations for \( 10^8 \) elements results in a factor of only \( 1.4 \). The result will, however, vary from compiler to compiler. In general, with optimization flags like -O3 or -Ofast, we gain a considerable speedup if our code can be vectorized. The compiler can apply many of these transformations automatically or nearly automatically.
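
To make the kind of measurement discussed here concrete, the following is a minimal sketch of a timed vector addition. It is not the contents of vecexample.cpp; the function names add_vectors and time_vector_add, the fill values, and the use of std::chrono::steady_clock are assumptions made for illustration only.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Element-wise addition c[i] = a[i] + b[i].
// The iterations are independent of each other, which is the property
// that typically allows a compiler to auto-vectorize the loop.
void add_vectors(const std::vector<double>& a, const std::vector<double>& b,
                 std::vector<double>& c) {
    const std::size_t n = c.size();
    assert(a.size() >= n && b.size() >= n);  // inputs must cover the output
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: fills two input vectors of length n, times a single
// call to add_vectors, and returns the elapsed wall-clock time in seconds.
// The result of the addition is left in c.
double time_vector_add(std::size_t n, std::vector<double>& c) {
    std::vector<double> a(n), b(n);
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = static_cast<double>(i);
        b[i] = 2.0 * static_cast<double>(i);
    }
    c.assign(n, 0.0);

    const auto start = std::chrono::steady_clock::now();
    add_vectors(a, b, c);
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double>(stop - start).count();
}
```

A small main function that reads the number of elements from the command line, calls time_vector_add and prints the returned time would give a program that can be compiled once without flags and once with -O3, as in the commands above. A single timed call is noisy, so repeating the measurement and averaging is advisable. With GCC, adding the option -fopt-info-vec to the -O3 build makes the compiler report which loops it actually vectorized.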