Compilation, especially with optimization flags like -O3, complicates how effective hand-written parallelization techniques end up being. The compiler is already applying its own optimizations, and prematurely optimizing in the source code may in fact hinder them. The better strategy is to 'expose' as much parallelism as possible, for example by writing loops that can be vectorized, so that when the compiler attempts its own optimizations, your code helps those efforts rather than hindering them.
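As a rough illustration of what 'exposing' parallelism can look like, here is a minimal C sketch. The function name saxpy and its parameters are made up for this example and are not taken from any particular codebase. The loop is kept plain, each iteration is independent of the others, and the restrict qualifiers promise the compiler that x and y do not overlap. Under those assumptions, an optimizing compiler at -O3 will typically be able to vectorize the loop by itself, though whether it actually does depends on the compiler and target.

```c
#include <stddef.h>

/* Hypothetical example: y[i] += a * x[i] for i in [0, n).
 *
 * No manual unrolling or intrinsics are used. The intent is to leave
 * the loop in a simple form the compiler can analyze:
 *   - each iteration touches only index i, so there is no
 *     loop-carried dependency;
 *   - restrict tells the compiler that x and y do not alias, which
 *     is an assumption the caller must uphold.
 */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

To check what the compiler did with the loop, GCC can report vectorization decisions with -O3 -fopt-info-vec, and Clang with -O3 -Rpass=loop-vectorize. If the loop is already vectorized, hand-unrolling it in the source is unlikely to help and may make the code harder for the compiler to analyze.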