If you happen to have a Cortex-M4 with FPU, this is for you:
__attribute__( ( always_inline ) ) static __INLINE float __VSQRTF(float op1)
{ float result;  /* receives the square root from the FPU */
  __ASM volatile ("vsqrt.f32 %0, %1" : "=w" (result) : "w" (op1) );
  return result; }
This should work with any recent CMSIS. It is much, much faster than libm's sqrtf.
Note that arm_sqrt_f32() from CMSIS-DSP does NOT use the FPU!