Code generation improvements.
On i386, enable SSE2 rather than only SSE when compiler optimizations
are requested, since the RT code mostly uses SSE2 for SIMD [1].
Add a patch to re-enable nested OpenMP constructs with clang [1].
Use devel/openmp for libomp.
Compile with clang 3.9 instead of 3.7.
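
For reference, a minimal sketch of what a "nested OpenMP construct" looks
like, i.e. a parallel region opened inside another one. This is not taken
from the RT sources: nested_sum() is a hypothetical helper made up for
illustration, and the _OPENMP guard is only there so that the file still
builds and returns the same result when OpenMP is disabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/*
 * Hypothetical helper (illustration only): sums i * j over an
 * outer x inner grid. The outer loop runs in one parallel region and
 * each iteration opens a second, nested parallel region for the inner loop.
 */
long nested_sum(int outer, int inner)
{
    long total = 0;

#ifdef _OPENMP
    /* Allow two active levels of parallelism; otherwise the inner
     * region is serialized by the runtime. */
    omp_set_max_active_levels(2);
#endif

#pragma omp parallel for reduction(+:total)
    for (int i = 0; i < outer; i++) {
        long row = 0;

        /* Nested construct: a parallel loop inside a parallel loop. */
#pragma omp parallel for reduction(+:row)
        for (int j = 0; j < inner; j++)
            row += (long)i * j;

        total += row;
    }
    return total;
}
```

Built with -fopenmp the pragmas take effect; without it they are ignored
and the function runs serially with the same result.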
Reported by: Ingo Weyrich [1]