Is there a way to compile without the assembly code (e.g., to build this on an ARM platform, which doesn't support SSE2)?