Is it possible to do an FFT without using assembly code (at least not directly)?
Yes. Writing an FFT in assembly is actually far more difficult than writing it in a high-level language, and FFT implementations in C and Fortran are far more common.
A fast 128-point, 8-bit FFT takes about 20 to 30 ms on an 8 MHz 8-bit AVR.