New SdFat optimized for Mega and Teensy 3.0

pito,

I was able to try 16-bit SPI frames. The write rate increased from 1776.44 KB/sec to 2013.34 KB/sec, a gain of about 13%.

The per-word overhead goes up because a byte swap is required: the first byte in the buffer has to go in the high half of the word. I form the 16-bit word to be sent like this:

    uint16_t w = *src++ << 8;
    w |= *src++;
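
For anyone who wants to check the byte order off the board, here is a minimal sketch of the same packing applied to a whole buffer. The helper name packWords and the separate destination array are my assumptions for illustration only; the actual driver presumably writes each word straight to the SPI data register instead of storing it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of SdFat: packs pairs of bytes into
// 16-bit words with the first byte of each pair in the high half.
// count is the number of source bytes and is assumed to be even.
void packWords(const uint8_t* src, uint16_t* dst, size_t count) {
  for (size_t i = 0; i < count / 2; i++) {
    // First byte becomes bits 15..8.
    uint16_t w = static_cast<uint16_t>(*src++ << 8);
    // Second byte becomes bits 7..0.
    w |= *src++;
    dst[i] = w;
  }
}
```

With the input bytes 0x12, 0x34 the resulting word is 0x1234, so the bytes go out in their original order when the frame is shifted out high bit first.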