Thank you very much for your answer. Results are better but not good enough.
It seems to work when I use
spi_bytes = ( ( (byte_0 & B00011111) <<10) + (byte_1 <<2) + (byte_2 >>6));
instead of
spi_bytes = ( ( (byte_0 & B00111111) <<10) + (byte_1 <<2) + (byte_2 >>6));
In other words, it works when I force D15 to zero. But looking at the timing diagram, I really do not understand why.
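
To make the difference between the two masks concrete, here is a minimal sketch in plain C++ (not my actual Arduino code). The helper names `assemble` and `d15_is_set` and the `mask` parameter are made up for illustration, and I am assuming the shorter mask is the 5-bit one (0x1F, i.e. B00011111) while the original is the 6-bit one (0x3F, i.e. B00111111). Bit 5 of byte_0 lands on bit 15 of the result after the shift by 10, so the two masks should differ only in D15.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: combines three SPI bytes into a 16-bit word the same
// way as the expression above, with the mask for byte_0 passed as a parameter.
uint16_t assemble(uint8_t byte_0, uint8_t byte_1, uint8_t byte_2, uint8_t mask) {
    // Widen to 32 bits before shifting so the shift cannot overflow a
    // 16-bit int (int is 16 bits wide on AVR-based Arduino boards).
    uint32_t value = (static_cast<uint32_t>(byte_0 & mask) << 10)
                   + (static_cast<uint32_t>(byte_1) << 2)
                   + static_cast<uint32_t>(byte_2 >> 6);
    return static_cast<uint16_t>(value);
}

// Hypothetical helper: true when bit 15 (D15) of the assembled word is set.
bool d15_is_set(uint16_t word) {
    return (word & 0x8000u) != 0;
}

// Small self-check: with all input bits set, the two masks differ only in D15.
void check_masks() {
    uint16_t six_bit  = assemble(0xFF, 0xFF, 0xFF, 0x3F);  // mask B00111111
    uint16_t five_bit = assemble(0xFF, 0xFF, 0xFF, 0x1F);  // mask B00011111
    assert(d15_is_set(six_bit));
    assert(!d15_is_set(five_bit));
    assert((six_bit ^ five_bit) == 0x8000u);
}
```

If this is right, the 5-bit mask simply discards whatever the device sends in the D15 position. It may also be worth checking whether spi_bytes is declared as a signed 16-bit type, since a set D15 would then be read as a negative number.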
I will retry your solution with an oscilloscope.
Thank you