Can I do this? LCD+Arduino = fast moving display?

You may not need those assembler nops. Even with no additional delay, the Arduino digitalWrite calls take longer than the Enable pulse width (450 ns) and Enable cycle time (1000 ns) specified in the HD44780 datasheet.

BTW, if you want the fastest performance, you should check the LCD busy flag to see when it's ready instead of using fixed delays before writing data.
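
Here is a rough sketch of what busy-flag polling could look like. It assumes 8-bit mode and that the LCD's R/W pin is wired to the Arduino (not tied to ground), since you have to read from the display. The LcdBus interface, waitUntilReady, and FakeLcd are names I made up for illustration; on real hardware you would implement LcdBus with digitalWrite/digitalRead/pinMode for your own pins. The FakeLcd is only there so the logic can be tried on a PC.

```cpp
#include <cstdint>

// Hypothetical pin abstraction (not part of any Arduino library).
// On a real board each method would wrap digitalWrite/digitalRead/pinMode.
struct LcdBus {
    virtual void setRS(bool high) = 0;
    virtual void setRW(bool high) = 0;
    virtual void setE(bool high) = 0;
    virtual void dataAsInput() = 0;    // switch D0..D7 to inputs
    virtual void dataAsOutput() = 0;   // switch D0..D7 back to outputs
    virtual uint8_t readData() = 0;    // returns D7..D0 as one byte
    virtual ~LcdBus() = default;
};

// Poll the HD44780 busy flag (bit 7 of a status read with RS=0, R/W=1).
// Returns true once the flag clears, false if maxPolls reads all showed busy.
bool waitUntilReady(LcdBus& bus, unsigned maxPolls = 1000) {
    bus.dataAsInput();
    bus.setRS(false);   // instruction/status register
    bus.setRW(true);    // read
    bool ready = false;
    for (unsigned i = 0; i < maxPolls && !ready; ++i) {
        bus.setE(true);                     // data is valid while E is high
        uint8_t status = bus.readData();
        bus.setE(false);
        ready = (status & 0x80) == 0;       // bit 7 = busy flag
    }
    bus.setRW(false);   // leave the bus ready for the next write
    bus.dataAsOutput();
    return ready;
}

// Simulated display for trying the logic off-board:
// reports busy for the first `busyReads` status reads, then ready.
struct FakeLcd : LcdBus {
    unsigned busyReads;
    unsigned polls = 0;
    explicit FakeLcd(unsigned n) : busyReads(n) {}
    void setRS(bool) override {}
    void setRW(bool) override {}
    void setE(bool) override {}
    void dataAsInput() override {}
    void dataAsOutput() override {}
    uint8_t readData() override {
        ++polls;
        return polls <= busyReads ? 0x80 : 0x00;
    }
};
```

You would call waitUntilReady(bus) right before each command or data write in place of the fixed delay. The maxPolls limit is just a safety net so the sketch doesn't hang forever if the display is disconnected.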