If you want a 2-pin design, you can get significantly better throughput by using an AND gate instead of an RC
network to multiplex the pins. The AND gate can be built from a resistor and a diode.
Doing that, I was able to get 76 us per byte transfer on a 16 MHz AVR.
That 76 us is the full overhead of an Arduino sketch going through the Print class, then down through the library,
sending the byte to the display, and returning to the sketch.
The library also uses 4-bit mode, which requires sending 2 nibbles, and because of the AND gate you have to
clear the SR before each nibble transfer.
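
To make the clear-then-load sequence concrete, here is a rough host-side simulation, not the library's actual code. It assumes the LCD enable line E is the AND of the data line and the last SR output, and the bit layout (EN_MASK, RS_MASK, DATA_MASK) and the TwoWireSim type are hypothetical names made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bit layout for this sketch (an assumption, not from the post).
constexpr uint8_t EN_MASK   = 0x80; // SR output Q7, ANDed with the data line to form E
constexpr uint8_t DATA_MASK = 0x78; // SR outputs Q3..Q6 wired to LCD D4..D7
constexpr uint8_t RS_MASK   = 0x04; // SR output Q2 wired to LCD RS

struct TwoWireSim {
    uint8_t sr = 0;               // current SR outputs
    bool data = false;            // level on the shared data line
    bool e = false;               // output of the AND gate (LCD E)
    unsigned shifts = 0;          // clock pulses issued so far
    std::vector<uint8_t> latched; // SR contents at each falling edge of E

    void update() {
        bool newE = data && (sr & EN_MASK);
        if (e && !newE) latched.push_back(sr); // LCD latches on E falling
        e = newE;
    }
    void setData(bool level) { data = level; update(); }
    void clockPulse() {
        sr = static_cast<uint8_t>((sr << 1) | (data ? 1 : 0));
        ++shifts;
        update();
    }
    void shiftOut(uint8_t v) {    // MSB first, data line left low afterwards
        for (int i = 7; i >= 0; --i) { setData((v >> i) & 1); clockPulse(); }
        setData(false);
    }
    void loadNibble(uint8_t v) {
        assert((v & 0x01) == 0);  // data must be low on the last clock or E would glitch
        shiftOut(0);              // clear first so Q7 stays low while new bits move through
        shiftOut(v);              // the EN bit reaches Q7 only on the final clock
        setData(true);            // data high AND Q7 high raises E
        setData(false);           // E falls, LCD latches the nibble
    }
    void sendByte(uint8_t value, bool isData) {
        uint8_t ctl = static_cast<uint8_t>(EN_MASK | (isData ? RS_MASK : 0));
        loadNibble(static_cast<uint8_t>(ctl | ((value >> 1) & DATA_MASK))); // upper nibble
        loadNibble(static_cast<uint8_t>(ctl | ((value << 3) & DATA_MASK))); // lower nibble
    }
};
```

Calling sendByte(0x41, true) on a fresh TwoWireSim issues 32 clock pulses (2 nibbles, each an 8-bit clear plus an 8-bit load) and produces exactly two E strobes, one per nibble, with no stray strobes while the bits are moving.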
The code I used is the SR2W code in this library: https://bitbucket.org/fmalpartida/new-liquidcrystal/wiki/Home
(I am the author of the SR2W interface code)
You can see a diagram of how the SR is wired up in LiquidCrystal_SR2W.h
Since you are a PIC guy, here is a link to a page that is doing essentially the same thing
From an overall design perspective, I think the AND gate multiplexer is better
than the RC network multiplexer, even though it requires 4 times the number of shifts to the SR,
since it is faster and allows backlight control.
It also has no timing-critical sections, so interrupts won't ever have to be masked
while doing the transfers.
Cost-wise, it should be nearly the same: resistor and diode vs. resistor and cap.
Then for a few additional pennies you can add backlight control.
The backlight control does require some damping on the transistor input to remove
the flicker caused by live output bits on the SR during shifting.
You can see a sample backlight circuit in the SR2W header file.
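
To illustrate why the backlight input needs damping, here is a small host-side sketch that traces one SR output during a clear-then-load sequence. It assumes an unlatched shift register whose outputs change on every clock, and BL_MASK and backlightTrace are hypothetical names for this sketch, not taken from the SR2W header.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical: backlight transistor driven from SR output Q1.
constexpr uint8_t BL_MASK = 0x02;

// Returns the level seen on the backlight output after each clock of a
// "clear, then load" sequence on an unlatched shift register.
// previous: SR contents before the transfer; next: byte being loaded.
std::vector<bool> backlightTrace(uint8_t previous, uint8_t next) {
    std::vector<bool> trace;
    uint8_t sr = previous;
    const uint8_t bytes[2] = {0x00, next};  // clear pass, then load pass
    for (uint8_t b : bytes) {
        for (int i = 7; i >= 0; --i) {      // MSB first
            sr = static_cast<uint8_t>((sr << 1) | ((b >> i) & 1));
            trace.push_back((sr & BL_MASK) != 0);
        }
    }
    return trace;
}
```

With the backlight bit set both before and after the transfer, for example backlightTrace(0x82, 0x82), the output still reads low for most of the 16 clocks and only settles high on the last one. That brief dropout on every transfer is the flicker the damping on the transistor input is there to smooth out.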