Maple-DMA is not very interesting. It polls using single byte SPI reads while the card is busy then sets up DMA and polls while DMA runs. This does not save any CPU cycles or reduce the write latency. In fact there is a huge amount of unnecessary code due to the fact that this is an old version of SdFat.
Using the Freescale Kinetis SPI fifo should be as fast or faster. A loop to deliver 3 MB/sec to the SPI fifo for 24 MHz SPI should be easy with the Freescale Kinetis.
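The 3 MB/sec figure is simply the SPI bit rate divided by eight. A small host-side check, assuming 8-bit frames sent back to back with no idle gaps (the function names are made up for this illustration and are not from SdFat or any Kinetis library):

```cpp
#include <cassert>
#include <cstdint>

// Bytes per second for a given SPI clock, assuming 8-bit frames
// with no idle time between them (an idealized upper bound).
inline uint32_t spiBytesPerSec(uint32_t sckHz) {
  assert(sckHz >= 8);
  return sckHz / 8;
}

// Whole microseconds needed to clock out nbytes at that rate (rounded down).
inline uint32_t spiMicros(uint32_t sckHz, uint32_t nbytes) {
  return static_cast<uint32_t>(
      static_cast<uint64_t>(nbytes) * 1000000u / spiBytesPerSec(sckHz));
}
```

At 24 MHz this gives 3,000,000 bytes/sec, so a 512-byte SD block occupies the bus for roughly 170 microseconds. That is the time budget the loop feeding the fifo has to keep up with.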
Improving use of cache and multi-block SD commands will produce a much larger gain than DMA. I can already get about twice the performance of the Maple-DMA version of SdFat on an AVR Mega with these improvements. I hope to get about another factor of three on Kinetis.
DMA could be useful with a multi-threaded RTOS. You could then build a true driver thread with the driver sleeping while DMA ran.
I plan to port ChibiOS and FreeRTOS to Cortex boards. ChibiOS and FreeRTOS already run on Maple. I plan to integrate the RTOS kernels as a library so you can use the IDE core functions and other libraries in multi-thread mode.
Unfortunately my Teensy 3.0 seems lost in the mail so I can't start testing. I ordered two more from Adafruit so something should arrive soon.
LOL, I'm just now seeing this thread.... been far too busy shipping the rewards and writing software!
I was wondering if anyone would ever notice those C++ classes?! I did some looking at libraries that use SPI, and sadly most of them directly access the AVR registers. The official SPI library arrived relatively late in the development of Arduino, so it hasn't been widely adopted. It also changed its own API at least once, causing at least one library author to dump it and go directly to the registers. The existing SPI library isn't much of an abstraction (eg, it can't support the fifo, dma, or automatic chip select signals). Fortunately, the compiler optimizes away pretty much all of the C++ stuff because it's inline functions. The SPCR part isn't highly efficient, but the data register and status flag compile to the equivalent register accesses. It was pretty painful having to clear the fifo every time SPDR is written, but that's necessary to faithfully emulate the AVR registers.....
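For anyone curious how C++ classes can stand in for AVR registers, here is a minimal host-side sketch of the general technique. It is not the actual Teensy core code: FakeSpiHardware, SPDRemulation and SPSRemulation are hypothetical names, and the "hardware" just loops each byte back inverted so the example can run on a PC.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the real SPI peripheral (assumption: loops data back inverted).
struct FakeSpiHardware {
  uint8_t rx = 0;
  bool done = false;
  void transfer(uint8_t out) { rx = static_cast<uint8_t>(~out); done = true; }
};
static FakeSpiHardware fakeSpi;

// Hypothetical emulation of the AVR SPDR data register.
class SPDRemulation {
 public:
  // "SPDR = b;" starts a transfer.
  inline SPDRemulation& operator=(int val) {
    fakeSpi.transfer(static_cast<uint8_t>(val));
    return *this;
  }
  // "b = SPDR;" returns the received byte and clears the done flag.
  inline operator int() const {
    int v = fakeSpi.rx;
    fakeSpi.done = false;
    return v;
  }
};

// Hypothetical emulation of the AVR SPSR status register (SPIF is bit 7).
class SPSRemulation {
 public:
  inline int operator&(int mask) const {
    return (fakeSpi.done ? (1 << 7) : 0) & mask;
  }
};

static SPDRemulation SPDR;
static SPSRemulation SPSR;
static const int SPIF = 7;

// Register-style AVR code now compiles unchanged against the emulation.
static void spiSend(uint8_t b) {
  SPDR = b;
  while (!(SPSR & (1 << SPIF))) {}
}

static uint8_t spiRec() {
  SPDR = 0xFF;
  while (!(SPSR & (1 << SPIF))) {}
  return SPDR;
}
```

Because every member function is inline and trivially small, a compiler can usually reduce each statement to a direct access of the underlying peripheral register, which is the effect described above.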
For your sdfat library, at least making good use of the fifo should be much faster. Would you prefer to put the Freescale registers directly into your sdfat library, or work with a new SPI library that supports the fifos and other features (and might be adaptable to other new chips with similar SPI features)?
Paul, I am doing a major redesign of SdFat to use better caching and faster SD commands so large writes/reads will be much faster.
I plan to support SPI and 4-bit SDIO on various Cortex M chips. I also want to make SdFat RTOS friendly when using DMA.
I would love to have a better low level SPI library for each chip.
I need a way to restore the SPI speed and mode each time I access the SD. I need single byte read and write for sending commands, receiving status, and polling for busy.
I need fast block read and write routines. These could use a fifo or DMA.
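The requirements above could be captured in an interface along these lines. This is only a sketch: SdSpi, FakeSdSpi and waitNotBusy are hypothetical names, not part of SdFat or any existing SPI library, and the fake implementation exists only so the code can be exercised on a PC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical low-level SPI interface covering the needs listed above.
class SdSpi {
 public:
  virtual ~SdSpi() {}
  // Restore clock rate and SPI mode before each SD access.
  virtual void begin(uint32_t sckHz, uint8_t mode) = 0;
  // Single-byte transfers for commands, status, and busy polling.
  virtual uint8_t receive() = 0;
  virtual void send(uint8_t b) = 0;
  // Fast block transfers; an implementation may use a fifo or DMA.
  virtual void receive(uint8_t* buf, size_t n) = 0;
  virtual void sendBlock(uint8_t token, const uint8_t* buf) = 0;  // 512 bytes
};

// Host-side fake: records what was sent and always reads 0xFF.
class FakeSdSpi : public SdSpi {
 public:
  std::vector<uint8_t> sent;
  uint32_t sckHz = 0;
  uint8_t mode = 0;
  void begin(uint32_t hz, uint8_t m) override { sckHz = hz; mode = m; }
  uint8_t receive() override { return 0xFF; }
  void send(uint8_t b) override { sent.push_back(b); }
  void receive(uint8_t* buf, size_t n) override {
    for (size_t i = 0; i < n; i++) buf[i] = receive();
  }
  void sendBlock(uint8_t token, const uint8_t* buf) override {
    send(token);
    for (size_t i = 0; i < 512; i++) send(buf[i]);
  }
};

// Poll with single-byte reads until the card releases the bus (reads 0xFF).
inline bool waitNotBusy(SdSpi& spi, unsigned maxPolls) {
  while (maxPolls--) {
    if (spi.receive() == 0xFF) return true;
  }
  return false;
}
```

Keeping the block routines behind the same interface as the single-byte ones would let a port swap in a fifo or DMA implementation without touching the SD command code.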
I am ready to start testing with some prototype SPI functions I have done but for some reason my Teensy 3.0 has not arrived in California yet.
Edit: I need the equivalent of these AVR functions.
//------------------------------------------------------------------------------
/**
 * Initialize hardware SPI
 * Set SCK rate to F_CPU/pow(2, 1 + spiRate) for spiRate [0,6]
 */
static void spiInit(uint8_t spiRate) {
  // See avr processor documentation
  SPCR = (1 << SPE) | (1 << MSTR) | (spiRate >> 1);
  SPSR = spiRate & 1 || spiRate == 6 ? 0 : 1 << SPI2X;
}
//------------------------------------------------------------------------------
/** SPI receive a byte */
static uint8_t spiRec() {
  SPDR = 0XFF;
  while (!(SPSR & (1 << SPIF)));
  return SPDR;
}
//------------------------------------------------------------------------------
/** SPI read data - only one call so force inline */
static inline __attribute__((always_inline))
void spiRead(uint8_t* buf, uint16_t nbyte) {
  if (nbyte-- == 0) return;
  SPDR = 0XFF;
  for (uint16_t i = 0; i < nbyte; i++) {
    while (!(SPSR & (1 << SPIF)));
    uint8_t b = SPDR;
    SPDR = 0XFF;
    buf[i] = b;
  }
  while (!(SPSR & (1 << SPIF)));
  buf[nbyte] = SPDR;
}
//------------------------------------------------------------------------------
/** SPI send a byte */
static void spiSend(uint8_t b) {
  SPDR = b;
  while (!(SPSR & (1 << SPIF)));
}
//------------------------------------------------------------------------------
/** SPI send block - only one call so force inline */
static inline __attribute__((always_inline))
void spiSendBlock(uint8_t token, const uint8_t* buf) {
  SPDR = token;
  for (uint16_t i = 0; i < 512; i++) {
    uint8_t b = buf[i];
    while (!(SPSR & (1 << SPIF)));
    SPDR = b;
  }
  while (!(SPSR & (1 << SPIF)));
}
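The spiRate mapping in spiInit() is compact, so here is a host-side sketch that spells out which bits each spiRate selects and the resulting clock divisor. The AvrSpiClock struct and avrSpiClock() function are made up for this check; the divisor table (SPR bits select /4, /16, /64, /128, and SPI2X halves the divisor) follows the AVR datasheet.

```cpp
#include <cassert>
#include <cstdint>

struct AvrSpiClock {
  uint8_t spr;       // SPR1:SPR0 bits written to SPCR
  bool spi2x;        // SPI2X bit written to SPSR
  uint16_t divisor;  // resulting SCK = F_CPU / divisor
};

// Mirror the bit selection used in spiInit() and compute the divisor.
inline AvrSpiClock avrSpiClock(uint8_t spiRate) {
  assert(spiRate <= 6);
  AvrSpiClock c;
  c.spr = spiRate >> 1;
  c.spi2x = !((spiRate & 1) || spiRate == 6);
  // SPR selects the base divisor; SPI2X doubles the clock.
  static const uint16_t base[4] = {4, 16, 64, 128};
  c.divisor = base[c.spr & 3] / (c.spi2x ? 2 : 1);
  return c;
}
```

For every spiRate from 0 to 6 the divisor works out to 2 raised to (1 + spiRate), matching the comment on spiInit(). The special case for spiRate == 6 is needed because SPR = 3 with SPI2X set would give /64 again instead of /128.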
fat16lib:
Edit: I need the equivalent of these AVR functions.
I can do those, using the fifo for good speed.
Is there a version of your library which already uses these? Or some test code that calls them to do something simple, like read and print the MBR or Volume ID sector?
I need some sort of test code that I can compile and run.
It also has the following function to initialize AVR SPI pins:
/**
 * initialize SPI pins
 */
static void spiBegin() {
  pinMode(MISO, INPUT);
  pinMode(MOSI, OUTPUT);
  pinMode(SCK, OUTPUT);
  // SS must be in output mode even if it is not chip select
  pinMode(SS, OUTPUT);
  // set SS high - may be chip select for another SPI device
#if SET_SPI_SS_HIGH
  digitalWrite(SS, HIGH);
#endif  // SET_SPI_SS_HIGH
}
This version of SdFat does not have the new stuff to speed up large reads and writes. That involves changes to use multi-block SD commands.
One minor but important point for compiling on 32 bit platforms is the use of packed structs. By default, the compiler will align 32 bit types to 4 byte boundaries on 32 bit processors. That's definitely not what you want in SdFatStructs.h. It's necessary to add "__attribute__((packed))" to each struct definition, so the compiler packs the struct as intended.
For example.
struct masterBootRecord {
  /** Code Area for master boot program. */
  uint8_t codeArea[440];
  /** Optional Windows NT disk signature. May contain boot code. */
  uint32_t diskSignature;
  /** Usually zero but may be more boot code. */
  uint16_t usuallyZero;
  /** Partition tables. */
  part_t part[4];
  /** First MBR signature byte. Must be 0X55 */
  uint8_t mbrSig0;
  /** Second MBR signature byte. Must be 0XAA */
  uint8_t mbrSig1;
} __attribute__((packed));
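The effect of packing is easy to see with a two-field struct. This is a small sketch for GCC or Clang; UnpackedEntry and PackedEntry are made-up names for illustration and are not taken from SdFatStructs.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Default layout: the compiler may insert padding after 'boot' so that
// 'firstSector' starts on its natural alignment boundary.
struct UnpackedEntry {
  uint8_t boot;
  uint32_t firstSector;
};

// Packed layout: fields follow each other with no padding, which is what
// an on-disk format requires.
struct PackedEntry {
  uint8_t boot;
  uint32_t firstSector;
} __attribute__((packed));

// Byte offset of firstSector in each layout.
inline size_t unpackedOffset() { return offsetof(UnpackedEntry, firstSector); }
inline size_t packedOffset() { return offsetof(PackedEntry, firstSector); }
```

With packing, firstSector sits at offset 1 and the struct is 5 bytes. Without it, a typical 32 bit target places firstSector at offset 4 and grows the struct to 8 bytes, so every field after the first misaligned one would be read from the wrong place in a sector buffer.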
Well, given that Kickstarter has now changed the rules, Teensy 3.0 as it was funded would not have been allowed under the new rules, so it may not make any difference whether you tell them or not. Paul himself has spoken out against the new rules, and I imagine hardware projects like the Teensy will just go elsewhere.
Kickstarter can go back to being a tip jar to fund art projects, as it evidently wants to do. As I said in their blog, I wish them a good life, but I have stopped looking at KS for interesting tech projects to fund. The only thing I do on KS now is check on updates to the 4 tech products I recently backed (RadioBlock, Teensy, Digispark, and JumpShot) as well as the one non-tech product (Deck of Extraordinary Voyages).
I believe the recent rule change was motivated mainly by protecting Kickstarter from liability, rather than protecting backers from failed projects. I haven't seen anything really conclusive about the Hanfree lawsuit, but the plaintiff publicly said he asked the court to rule that the nature of the transaction was an ordinary sale. I'm an engineer, not an attorney, so I'm not going to speculate what that might mean.
It could also be purely coincidental that the "not a store" rule change came about just as the Hanfree lawsuit was making its way through the legal system.
Here's the failed Hanfree project, where you can read all the ugly details in the comments.
Just wanted to let others know of my progress so far on using my new Teensy 3.0 board. I had to order a micro USB cable and it arrived yesterday. I had previously loaded the modified IDE and the Teensy driver thingee, so I just attached the Teensy to the cable and plugged it in. The PC seemed to be happy with the attachment and an LED on the Teensy was blinking away, so I assumed they ship it with the blink sketch loaded. I opened the IDE, selected the Teensy 3.0 board, loaded the minimum sketch example and it uploaded. I was a little surprised when the Teensy loader pop-up window sprung up, as I had no idea how the Teensy works with the Arduino IDE, but the loader has an option to follow a script log and it seemed to all be working correctly. The IDE results window says something about the compile size being zero bytes, but the Teensy loader pop-up log shows all the correct size info and a lot of other stuff. Anyway, the Teensy board did stop blinking its LED, so everything seemed to upload and run OK. I then loaded the blink sketch example in the IDE and hit upload and everything worked again, and the board did indeed start blinking its LED again.
So I guess the report is that the Teensy 3.0 seems to work right out of the box as designed, even for the software-installing-challenged kind of guy that I am. I still haven't a clue what I might do with this board yet. And Paul seems to be releasing a new IDE version every other day to add some new Arduino library update, so it seems kind of silly to rush into anything. But it's a great little product with a lot of promise ahead for it, I think. I kind of hope a Teensy forum might start up to help support this product, if one is not already around somewhere?
I'm still getting over the shock of how....well teensy this thing is, so small.
@Lefty - glad it's working. I fixed the size reporting in beta4. This evening I'm going to publish beta5, with a master-only port of Wire (slave mode to be filled in next week), and Serial1, Serial2, and Serial3 working.
@fat16lib - I applied the patch. Now the code compiles. It's not quite working, but that's probably a bug on my end. I'm investigating now. Will try to get those 24 Mbit/sec optimized routines written for you.
I'm still getting over the shock of how....well teensy this thing is, so small.
I know how you feel. When I first had my Teensy in hand I was also going like "this small?" And indeed it all worked great out of the box and the blink sample is preloaded (this is part of Paul testing the boards).
I would really advise Arduino to look at what Paul is doing (more often). For instance: even though the Teensy 2 uses the same chip as the Leonardo, it does not have the 2 com ports issue, which is very distracting. The Due has 2 com ports. I guess if they had used halfkay (the window that popped up during upload) like Teensy they would not need 2 com ports.
My Teensy is on its way. I got a friend to order from Ireland where it has arrived, looking forward to it finishing the final leg of its journey to Dubai next week.
I've been working with the SdFat beta for the last hour, digging into why it's not working. Turns out __attribute__((packed)) bit me yet again. I forgot I'd started from a clean copy to apply the patch.
I mailed you a version with mods for faster reads/writes. It compiles for Teensy 3.0 but is not tested on Teensy 3.0 since I have not received the replacement for the missing Teensy 3.0 that Robin sent yesterday.
The results for an AVR Mega with 4096 byte writes and reads are promising.
Type any character to start
Free RAM: 2666
Type is FAT16
File size 5MB
Buffer size 4096 bytes
Starting write test. Please wait up to a minute
Write 536.40 KB/sec
Maximum latency: 10336 usec, Minimum Latency: 6908 usec, Avg Latency: 7592 usec
Starting read test. Please wait up to a minute
Read 595.04 KB/sec
Maximum latency: 7984 usec, Minimum Latency: 6804 usec, Avg Latency: 6877 usec