
Topic: extra RAM on Arduino Due

pito

#15
May 30, 2013, 04:07 pm Last Edit: May 30, 2013, 04:31 pm by pito Reason: 1
I built an external sram device with XC9536XL ($1) and a 4MBytes large SRAM chip. You need 11 wires to control the memory (8data+3bit control). With bigger cpld (more pins) you can access an "unlimited" size of sram. It has got an auto increment feature thus a rd/wr to the device automatically increments the sram's address.

The r/w speed with pic32@80MHz (bitbanging, no dma) and with larger blocks (ie. 512bytes) is ~6.5Mbytes/sec. It can be used in "16bit mode" (16data+3bit control wires required) with double speed.

Actually, the data width is not related to the cpld, so you may run it in 32b or 64b data width mode (4x or 8x faster) but you need more data wires, unfortunately..

More on:
http://retrobsd.org/viewtopic.php?f=5&t=991

ralphnev


I built an external sram device with XC9536XL ($1) and a 4MBytes large SRAM chip. You need 11 wires to control the memory (8data+3bit control). With bigger cpld (more pins) you can access an "unlimited" size of sram. It has got an auto increment feature thus a rd/wr to the device automatically increments the sram's address.
[snip]


I looked at a similar method, but it requires too much processor intervention.
- Using SPI/USART with PDC/DMA should get up to 4 MBytes/sec with very little processor intervention ...

My chunks are 4096 bytes, occurring once every 5 ms (and faster if I can make other improvements),
so low processor overhead is very important to me.
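
To put numbers on that requirement: 4096 bytes every 5 ms is about 0.82 MB/sec sustained. The sketch below is plain C++ that also compiles on a PC. The function names are made up for illustration, and the 4 MBytes/sec figure is only the estimate quoted above, not a measurement.

```cpp
#include <cstdint>

// Sustained throughput, in bytes per second, needed to move chunkBytes
// every periodUs microseconds.
constexpr double requiredBytesPerSec(uint32_t chunkBytes, uint32_t periodUs) {
  return chunkBytes * 1e6 / periodUs;
}

// Fraction of each period the link is busy moving one chunk,
// for a link that sustains linkBytesPerSec.
constexpr double busyFraction(uint32_t chunkBytes, uint32_t periodUs,
                              double linkBytesPerSec) {
  return (chunkBytes / linkBytesPerSec) / (periodUs * 1e-6);
}
```

With these, requiredBytesPerSec(4096, 5000) comes out at 819200 bytes/sec, and at an assumed 4,000,000 bytes/sec the transfer would occupy roughly 20% of each 5 ms period. Whether the CPU is free during that time depends on DMA actually doing the work.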

gaith

Hi,
I don't understand everything you're talking about, but it confirms that it's possible to add parallel SRAM with high performance.

Pito, could you explain something I don't understand (I'm already thinking about the library I'll have to write):
You say in your post that doubling the data width doubles the data transfer speed. But can the Arduino read, for example, the state of 32 input pins at the same time? If I call digitalRead(x) 32 times to read the DATA, I think it must take more CPU cycles than reading 8 inputs, and so slow down the RAM access frequency? Or is there a special function on Arduino which can read all 32 input bits at the same time?

Sorry for my questions, I'm totally noob in low-level micro controllers.
Build a groove box with Arduino Due :
http://groovuino.blogspot.com/

pito

#18
May 31, 2013, 11:14 am Last Edit: May 31, 2013, 11:53 am by pito Reason: 1
@gaith: some mcus can read/write a 16bit port with single instruction (ie. they have 16bit ports) or maybe 32bit ports as well. You have to investigate. I do not have DUE handy, but I would guess 32bit ARM or 32bit AVR can read/write 16/32bit data from/to its ports..

Quote
If I call digitalRead(x) 32 times

With an 8bit Arduino you can read/write an 8bit port in a single instruction; reading data with digitalRead() is of course something I would never ever consider. Again - any mcu I know can read/write 8bit wide data from a port with a single instruction.

This writes a byte to the port B:
Code: [Select]
 DDRB = 0xFF; //sets port B to output
 PORTB = addr;


This reads a byte from the port D:
Code: [Select]
 DDRD = 0x00; //sets port D to input
 data = PIND;


More reading: http://www.arduino.cc/en/Reference/PortManipulation

The pic32mx I referenced above has 16bit ports, so you can r/w the 8b or 16b port with single instruction. For a 32bit mcu it basically does not matter whether you read/write 8/16/32 bit - it is always (or mostly) a single instruction, because they always work with 32bit data internally..
Example:
Reading 8bits from the above disk device:
Code: [Select]

loop {
set /RD low
int8 data[i] = (PORTA & 0x00FF)
set /RD high
}


Reading 16bits from the above disk device:
Code: [Select]

loop {
set /RD low
int16 data[i] = PORTA
set /RD high
}


Writing 8bits to the above disk device:
Code: [Select]

loop {
set /WR low
PORTA = (int8 data[i]  & 0x00FF)
set /WR high
}


Writing 16bits to the above disk device:
Code: [Select]

loop {
set /WR low
PORTA = (int16 data[i] )
set /WR high
}
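
The pseudocode above boils down to one register access plus a mask. Here is a minimal C++ sketch of that idea, written so it also compiles on a PC: the register is passed in as a pointer, and the function names are hypothetical. On the Due the pin-level input register should be something like PIOC->PIO_PDSR, but that is an assumption to check against the SAM3X datasheet, and it only helps if the data lines sit on consecutive bits of the same port.

```cpp
#include <cstdint>

// Read the low 8 data bits of a 32-bit port input register in one access.
inline uint8_t readPort8(const volatile uint32_t* pinReg) {
  return static_cast<uint8_t>(*pinReg & 0xFFu);
}

// Same single access, keeping the low 16 bits instead.
inline uint16_t readPort16(const volatile uint32_t* pinReg) {
  return static_cast<uint16_t>(*pinReg & 0xFFFFu);
}

// Write 8 data bits while leaving the other 24 bits of the port untouched.
// This is a read-modify-write; real hardware may offer set/clear or
// write-mask registers that avoid the extra read.
inline void writePort8(volatile uint32_t* outReg, uint8_t value) {
  *outReg = (*outReg & ~0xFFu) | value;
}
```

The read costs the same whether you keep 8 or 16 bits of the result, which is why widening the data bus roughly doubles the throughput.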


gaith

Great !
So I have to find out how to access the port registers on the Due.
It seems that someone already tried to adapt an Arduino Mega library which can do this, but after reading the posts... I don't know if it works or not.
http://forum.arduino.cc/index.php?PHPSESSID=5mukpmk3fcgd6712quj4fnop57&topic=129868.0

Maybe it would be easier if I used some ARM assembly code directly.

I'll keep investigating.

Thanks a lot for your explanations.

fat16lib

I understand you are going to add RAM, but I am curious why reading from the SD fails.

I looked at the Groovuino library and was astounded that only one file handle is used and files are opened and closed while playing sound.

Opening a file is very slow, so I would have used an array of file handles and opened all the files before playing sound.  A file handle only requires about 32 bytes.  Rewinding or seeking to the start of a file requires no SD access for an open file.

gaith

#21
May 31, 2013, 06:19 pm Last Edit: May 31, 2013, 06:44 pm by gaith Reason: 1
Yes, that's what I'm doing, and it works. I can play 4 files at the same time, but I want more!

In fact I have 2 classes: sampler.h and samplerl.h.
sampler.h opens the file each time it's played. It is for using different samples for each pattern. So I'm not limited in the number of wave files I can use, but it's not optimized. I can only do 2-voice polyphony.
samplerl.h opens one file for each pattern, then uses the seek function to go back to the beginning. Better access time, but I can only use one wave file per pattern, and I reach 4-voice polyphony.

What do you call the "file handle"? Is it the SdFile object?

fat16lib

The SdFile object acts like a file handle in other systems.  It contains information from the directory entry and cluster information for the current position.  A number of blocks must be read from file structures to open a file and seek to a position.

If I wanted to optimize reads from a large number of files I would use raw SD reads.

When you copy files to freshly formatted SD, the files are contiguous.  SdFat has a function to determine if a file is contiguous and where the blocks are located.

Quote

bool SdBaseFile::contiguousRange(uint32_t* bgnBlock, uint32_t* endBlock)

Check for contiguous file and return its raw block range.

Parameters:
    [out]   bgnBlock   the first block address for the file.
    [out]   endBlock   the last block address for the file.

Returns:
    The value one, true, is returned for success and the value zero, false, is returned for failure. Reasons for failure include file is not contiguous, file has zero length or an I/O error occurred.


I would open each file and find its location with the above function.

I would then use either the Sd2Card single block read function:
Code: [Select]
bool Sd2Card::readBlock (uint32_t block, uint8_t *dst);

Or the Sd2Card multi-block sequence:
Code: [Select]

bool Sd2Card::readStart (uint32_t blockNumber);  // set start block for a multiple block read sequence.

bool Sd2Card::readData (uint8_t *  dst);  // Read one data block in a multiple block read sequence.

bool Sd2Card::readStop ();  // End a read multiple blocks sequence.


SD cards do look ahead for multiple block reads so are very efficient in this mode.
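
Once contiguousRange() has given you bgnBlock and endBlock for each file, seeking is just arithmetic on block numbers. Here is a rough sketch of that bookkeeping in plain C++. RawFile, blockCount and blockForOffset are hypothetical names for illustration, not part of SdFat, and the code assumes 512-byte blocks and an inclusive endBlock as described in the documentation quoted above.

```cpp
#include <cstdint>

// Raw location of one contiguous file, as filled in by contiguousRange().
struct RawFile {
  uint32_t bgnBlock;  // first 512-byte block of the file
  uint32_t endBlock;  // last 512-byte block of the file (inclusive)
};

constexpr uint32_t kBlockSize = 512;

// Number of blocks covered by the file.
constexpr uint32_t blockCount(const RawFile& f) {
  return f.endBlock - f.bgnBlock + 1;
}

// Card block that holds byte `offset` of the file.
// Returns false if the offset lies past the last block.
inline bool blockForOffset(const RawFile& f, uint32_t offset, uint32_t* block) {
  uint32_t b = f.bgnBlock + offset / kBlockSize;
  if (b > f.endBlock) return false;
  *block = b;
  return true;
}
```

You could keep one RawFile per sample in an array, filled once at startup. "Rewinding" a sample then means starting the next read at bgnBlock again, with no directory or FAT access.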

gaith

Hi Fat16,
Thanks for the tip. I've been trying for 2 days to read the files the way you describe, but I must be doing something wrong.

Here is the code I use :

Code: [Select]
#include <arduino.h>
#include <SdFat.h>

SdFat sd;
Sd2Card *card = sd.card();

const int chipSelect = 10;
const int bufsize = 512;

const char* samplefile[]= {"kick1.wav", "hithat1.wav", "snare1.wav", "snare2.wav"};

uint8_t buf[bufsize];

SdFile myFile;

uint32_t bgnBlock;
uint32_t posBlock;
uint32_t endBlock;

void setup()
{
  Serial.begin(9600);   
  sd.begin(chipSelect, SPI_FULL_SPEED);
 
  myFile.open(samplefile[0], O_READ);

  posBlock = bgnBlock;

  //card->readBlock(posBlock,buf);
  card->readStart(posBlock);
  card->readData(buf);

  for(int i=0; i<10; i+=1)
  {
      Serial.print("block : ");
      Serial.println(posBlock);
      //card->readBlock(posBlock,buf);
      card->readData(buf);

      for(int j=0; j<255; j+=1)
      {
   tes[i] = ((int16_t)buf[1+2*j]<<8) + (int16_t)buf[2*j];
   Serial.println(tes[j]);
       }
       posBlock+=1;
       
     }
     card->readStop();
}


I've tried both readBlock and readStart / readData / readStop.
The first block is always OK, but after that it seems that only the first byte of the buffer is filled... I don't understand it at all.
Can you see if I'm doing something wrong in my code?

Thanks

@Pito and Grumpy_Mike: I received my RAM and the other components. It will be hard to solder the RAM as it's not DIP socket, but I will find a way. I'll keep you posted.

Grumpy_Mike

Quote
It will be hard to solder the RAM as it's not DIP socket,

Sorry, I thought you knew that.
Look for an adapter board, the ones on ebay are often 10 times cheaper than those on Farnell.

fat16lib

#25
Jun 04, 2013, 04:27 pm Last Edit: Jun 04, 2013, 04:47 pm by fat16lib Reason: 1
Here is a sketch that will read a file using raw reads.

I tested it with about a five MB file on a 1 GB ATP industrial SD.

The result was a read speed of about 4.5 MB/sec on Due.

Quote

blocks: 9765
micros: 1094559
MB/sec: 4.57

Code: [Select]

#include <SdFat.h>
SdFat sd;
SdFile file;
static const uint8_t SD_CS = SS;
uint32_t bgnBlock;
uint32_t endBlock;
uint8_t buf[512];

void setup() {
 Serial.begin(9600);
 if (!sd.begin(SD_CS) || !file.open("TEST.WAV", O_READ)) {
     Serial.println("begin/open");
     while(1);
 }
 if (!file.contiguousRange(&bgnBlock, &endBlock)) {
   Serial.println("not contiguous");
   while(1);
 }
 // count of blocks in file;
 uint32_t n = (file.fileSize() + 511)/512;
 // read start time
 uint32_t t0 = micros();
 
 // address of first block
 sd.card()->readStart(bgnBlock);

 for (uint32_t i = 0; i < n; i++) {
   if (!sd.card()->readData(buf)) {
     Serial.println("readBlock");
     while(1);
   }
 }
 sd.card()->readStop();
 uint32_t t = micros() - t0;
 Serial.print("blocks: ");
 Serial.println(n);
 Serial.print("micros: ");
 Serial.println(t);
 Serial.print("MB/sec: ");
 Serial.println(512.0*n/t);
}
void loop() {}


Edit: I did some tests with four and eight block reads to get the time to read a chunk of a file.  These are using the industrial ATP card so will be faster than some consumer cards.
Quote

blocks: 4
micros: 574
MB/sec: 3.57

Quote

blocks: 8
micros: 1022
MB/sec: 4.01

So you can read a 2048 byte chunk in 574 usec and a 4096 byte chunk in 1022 usec.
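
To see what those read times mean for polyphony, here is a back-of-the-envelope calculation in plain C++. It assumes 16-bit mono samples played at 44.1 kHz, which is my assumption and not something stated in this thread, and the function names are made up for the example.

```cpp
#include <cstdint>

// Playback time, in microseconds, covered by one chunk of 16-bit mono samples.
constexpr double chunkDurationUs(uint32_t chunkBytes, uint32_t sampleRateHz) {
  return (chunkBytes / 2) * 1e6 / sampleRateHz;
}

// Upper bound on the number of voices if SD reads were the only cost:
// how many chunk reads of readUs fit into the time one chunk plays.
constexpr uint32_t maxVoices(uint32_t chunkBytes, uint32_t readUs,
                             uint32_t sampleRateHz) {
  return static_cast<uint32_t>(chunkDurationUs(chunkBytes, sampleRateHz) / readUs);
}
```

Under that assumption a 2048 byte chunk lasts about 23.2 ms, so maxVoices(2048, 574, 44100) gives 40. That is only a ceiling for the SD side; mixing and everything else the sketch does will lower it.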

gaith

#26
Jun 05, 2013, 11:28 am Last Edit: Jun 05, 2013, 01:17 pm by gaith Reason: 1
Thanks, it works.
With the myFile.read() function, I could load the wave data into any data type. With the readBlock function I get the wave data in a uint8_t[512] buffer.
What I need is an int16_t[1024] buffer (the wave format is 16-bit signed integer, and I need 1024 samples).

I've tried loading a uint8_t[4][512] buffer (calling readBlock 4 times), then doing some computation to copy it into the int16_t[1024] buffer, but it takes too much CPU time, and I have to instantiate 2 buffers instead of one. So the performance is lower than with the myFile.read() function.

My intuition tells me to use pointers, to load a uint8_t[2048] buffer directly and read it as if it were an int16_t[1024], but I haven't found a way to do it.
I'm sure it has already been done (in SD wave players for example), but I didn't find anything on the subject.

Edit: I found that SdFatLib uses the "reinterpret_cast" operator, but I don't know if it can be used on a whole array.

Any idea ?

Thanks

fat16lib

#27
Jun 05, 2013, 04:51 pm Last Edit: Jun 05, 2013, 04:53 pm by fat16lib Reason: 1
Here is a function that will read a file chunk into any type destination.
Code: [Select]

#include <SdFat.h>
SdFat sd;
SdFile file;
const uint8_t SD_CS = SS;
uint32_t bgnBlock;
uint32_t endBlock;

uint16_t wave[1024];
//--------------------------------------------------------
bool readChunk(void* buf, uint32_t startBlock, uint16_t blockCount) {
 uint8_t* dst = (uint8_t*)buf;
 if (!sd.card()->readStart(startBlock)) return false;
 for (uint16_t i = 0; i < blockCount; i++) {
   if (!sd.card()->readData(dst + i*512L)) return false;
 }
 return sd.card()->readStop();
}
//---------------------------------------------------------
void setup() {
 Serial.begin(9600);
 if (!sd.begin(SD_CS) || !file.open("TEST.WAV", O_READ)) {
     Serial.println("begin/open");
     while(1);
 }
 if (!file.contiguousRange(&bgnBlock, &endBlock)) {
   Serial.println("not contiguous");
   while(1);
 }
 uint16_t n = 4;  
 uint32_t t0 = micros();

 if (!readChunk(wave, bgnBlock, n)) {
   Serial.println("readChunk");
   while(1);
 }
 uint32_t t = micros() - t0;
 Serial.print("blocks: ");
 Serial.println(n);
 Serial.print("micros: ");
 Serial.println(t);
 Serial.print("MB/sec: ");
 Serial.println(512.0*n/t);
}
void loop() {}

Here is timing for reading a 1024 element array of uint16_t.
Quote

blocks: 4
micros: 576
MB/sec: 3.56
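
The trick in readChunk() is that the caller owns a typed array and the blocks are written into it through a byte pointer, so no second buffer or conversion loop is needed. The sketch below shows the same idea in plain C++ so it can be tried on a PC. fakeReadData and readChunkInto are hypothetical stand-ins (an in-memory "card" replaces the real Sd2Card::readData), and it assumes a little-endian CPU, which matches both the WAV sample layout and the Due.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for Sd2Card::readData(): copies one 512-byte block
// from an in-memory card image instead of talking to real hardware.
inline bool fakeReadData(const uint8_t* card, uint32_t block, uint8_t* dst) {
  std::memcpy(dst, card + block * 512u, 512);
  return true;
}

// Same shape as readChunk(): the destination is typed by the caller and
// filled block by block through a byte pointer.
inline bool readChunkInto(const uint8_t* card, void* buf,
                          uint32_t startBlock, uint16_t blockCount) {
  uint8_t* dst = static_cast<uint8_t*>(buf);
  for (uint16_t i = 0; i < blockCount; i++) {
    if (!fakeReadData(card, startBlock + i, dst + i * 512u)) return false;
  }
  return true;
}
```

Because the buffer parameter is a void*, the destination can just as well be declared as int16_t wave[1024] for signed samples and passed in unchanged.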

gaith

Great great thanks !
Now I can manage more than 6 voices of polyphony. I couldn't even reach the limits. This Due is very surprising !
Your function is really faster than the read() function.

I will update my library with this code.

I don't need the RAM anymore, but as I have received it, I will do the experiments anyway.

Thanks again to all.
