amd64: depessimize userspace memcpy/memmove/bcopy
The change resembles what was done in r334537 for kernel routines.
While here take care of i386 variants. Note that primitives remain
suboptimal.
Reviewed by: kib (previous version)
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17167