Enhanced LiquidCrystal

Quickly ballparking performance: I've added a couple of things since the previous benchmark to make println work properly, so this isn't quite the same code that was benchmarked before:
8 data pins +RW  619
8 data pins -RW  650
4 data pins +RW  678
4 data pins -RW  734
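
To put those figures in relative terms, here is a minimal sketch of the arithmetic. It assumes the numbers are elapsed times in a common unit, so lower is faster; the unit is not stated above. The function name percentSaved is made up for illustration.

```cpp
// Percentage by which the +RW (busy flag) figure is lower than the
// -RW (fixed delay) figure. Assumes both are times in the same unit.
double percentSaved(double withRw, double withoutRw) {
    return 100.0 * (withoutRw - withRw) / withoutRw;
}
```

Calling percentSaved(619, 650) gives a bit under 5% for 8 data pins, and percentSaved(678, 734) gives a bit under 8% for 4 data pins.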

So I'm thinking that pulling out all of the busy flag testing makes a lot of sense: it costs several hundred bytes of code, and the performance advantage is less than 10% once I switch all the data pins to INPUT during the busy flag test and back to OUTPUT when that is done.
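
For anyone following along, this is roughly what the INPUT/OUTPUT switching around a busy flag test looks like. It is only a sketch under assumptions, not the code in the library: it covers the 8 data pin case only, and the PinIo interface and the waitWhileBusy name are hypothetical stand-ins for pinMode(), digitalWrite() and digitalRead() so that the logic can be exercised off the board. It relies on the HD44780 reporting its busy flag on DB7 when RS is low and RW is high.

```cpp
#include <cstdint>

// Hypothetical pin-access interface. On an Arduino the four methods
// would wrap pinMode(pin, INPUT), pinMode(pin, OUTPUT),
// digitalWrite() and digitalRead().
struct PinIo {
    virtual void setInput(uint8_t pin) = 0;
    virtual void setOutput(uint8_t pin) = 0;
    virtual void write(uint8_t pin, bool level) = 0;
    virtual bool read(uint8_t pin) = 0;
    virtual ~PinIo() = default;
};

// Poll the busy flag (DB7) with 8 data pins wired.
// Returns true once the LCD reports ready, false if maxPolls runs out.
// No explicit delays are included; this assumes each pin call is slow
// enough to satisfy the controller's timing.
bool waitWhileBusy(PinIo& io, const uint8_t dataPins[8],
                   uint8_t rs, uint8_t rw, uint8_t enable,
                   unsigned maxPolls = 1000) {
    // Release the data bus before the LCD starts driving it.
    for (int i = 0; i < 8; ++i) io.setInput(dataPins[i]);
    io.write(rs, false);  // RS low: instruction register
    io.write(rw, true);   // RW high: read from the LCD

    bool busy = true;
    for (unsigned n = 0; busy && n < maxPolls; ++n) {
        io.write(enable, true);
        busy = io.read(dataPins[7]);  // DB7 carries the busy flag
        io.write(enable, false);
    }

    // Put RW back to write before driving the bus again, so the LCD
    // and the board never drive the data lines at the same time.
    io.write(rw, false);
    for (int i = 0; i < 8; ++i) io.setOutput(dataPins[i]);
    return !busy;
}
```

The two loops over the data pins are the extra work done on every busy flag test, which is where most of the advantage over a fixed delay goes.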

It will probably be the end of the weekend before I do all of that and test it thoroughly. The result will be shorter than what I have now and slightly faster than the non-busy-flag numbers above; it will still work with 40x4 and will fix linewrap, println, scroll preceding setCursor, and the 16x4 thing.

I really have run an awful lot of characters through this thing without seeing an electrical problem, but I certainly have to defer to the engineering guys; the logic diagram describing the Arduino digital pins, pull-up resistors, etc. is considerably beyond me.