8MB Ramdisk (external RAM) for Arduino..

Hi, I've been toying with these modules for a while. It's not designed as an Arduino shield, but it may be an interesting gadget for Arduino folk as well. It is an 8Mx8 ramdisk (external RAM), based on a single 8MB 70ns PSRAM with a Xilinx CPLD as the controller.
You need 11 wires to write/read it (8 data and 3 control), and it includes an address auto-in(de)crement feature. Easy to bit-bang.

. Zero write/read latencies and unlimited endurance
. 11 signals only
. Data bus width 4bit .. 8bit (min 4)
. Up to 12MB/sec Read and 15MB/sec Write throughput
. Supports standard memory bus signaling (PMP, EMB, FSMC, etc.)
. Byte access as well as block access with Address auto-increment or auto-decrement
. Fast Address reset, WR protect
. More modules in parallel with /MS (Module Select)
. For 3V3 platforms (3V3 328p, 1284p, Arduino Zero)
. Ideal for DUE and its External Memory Bus (as an 8bit static memory device)
. Proven as the Ramdisk for swap and filesystem (Fubarino w/ pic32MX@120MHz) at retrobsd.org
. Provided in form of a plug and play component, 2.54mm header pitch for easy breadboarding

:open_mouth:

That module would be interesting to rewrite the SD driver for. Also, with some extra hardware the interface could be SPI, and I guess that 16 MHz would not be a problem. With a device of this size a file system is more or less necessary. With an SD driver interface the FAT16 library could be used (though a "format" function will be needed).
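
To make the file-system idea a bit more concrete: a block-device layer mostly has to turn a sector number into a byte address on the ramdisk. Below is a minimal sketch, assuming 512-byte sectors (the usual SD/FAT sector size, not something the module dictates). The names `kDiskBytes`, `kSectorSize`, `kSectorCount` and `sector_to_address` are made up for illustration.

```cpp
#include <cstdint>

// Assumed geometry: 8MB module, 512-byte sectors (an illustrative choice).
constexpr uint32_t kDiskBytes   = 8UL * 1024UL * 1024UL;     // 8,388,608 bytes
constexpr uint32_t kSectorSize  = 512;
constexpr uint32_t kSectorCount = kDiskBytes / kSectorSize;  // 16384 sectors

// Stores the byte address of the sector's first byte and returns true,
// or returns false if the sector lies outside the 8MB range.
bool sector_to_address(uint32_t sector, uint32_t *address) {
    if (sector >= kSectorCount) return false;
    *address = sector * kSectorSize;
    return true;
}
```

Each sector access would then be one address load followed by 512 auto-incremented byte transfers.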

Do you have any numbers? What kind of access time? Byte/Block etc.

This would be great for very fast data loggers.

Cheers!

You may toy with the sketch below (you do not need the module to run it) to get a feel for the numbers. You can see how the "driver" works as well.
Note: the results are for 'raw' access times only (loop overhead excluded), except for the block rd/wr test. You may elaborate the code to fit your actual purposes and see how it performs; the numbers might be lower, however.
Note1: A quirk of the atmega is that a read from a port (i.e. the input read by "data = PINC") is delayed because of the input "synchroniser" (see the reference manual).

Update:

// RAMDISK v.1.1 - 8Mbytes - testing with arduino compatible platforms
// RAMDISK module requires 3.3Volt signal levels and power supply!
// Tested with 3.3V atmega1284p @16MHz and bit-banging
// UPDATE: March 31 2014
// Provided as-is, no warranties of any kind
// c 2014 by Pito
/*
RAMDISK MODULE PINS:

           |-----------------------------
DATA0    --- D0                       L  |
DATA1    --- D1                     @ E  |
DATA2    --- D2                       D  |
DATA3    --- D3                          |
Ground   --- GND                         |
DATA4    --- D4     RAMDISK v1.1         |
DATA5    --- D5                          |
DATA6    --- D6                          |
DATA7    --- D7                          |
NMS      --- /MS Module Select (optional)|
NRD      --- /RD Read signal, 85ns min   |
NWR      --- /WR Write signal, 48ns min  |
NDATA    --- /DATA  (A0 with PMP bus)    |
3V3      --- 3.3Volts power              |
           |-----------------------------

LED - indicates rd/wr memory accesses

*/
/*
 * NRD - read signal active low, min 85ns, mind the delayed input on atmega mcu
 * NWR - write signal active low, min 48ns, max 8us
 * NDATA - access to sram data - low, access to controller  - high
 * NMS - module select - active low (must be hardwired to low with single module)
 */

#include <DigitalIO.h>

// HW interface to the RAMDISK module (an example only, do assign yours control pins):
DigitalPin<12> NRD(OUTPUT);     //    /RD active LOW
DigitalPin<13> NWR(OUTPUT);     //    /WR active LOW
DigitalPin<14> NDATA(OUTPUT);   //    /Data active LOW
// PORTC is the Data bus D0-D7 (an example only)

typedef union {
	unsigned long value;
	struct {
		unsigned char nib1: 4;  // lowest nibble
		unsigned char nib2: 4;
		unsigned char nib3: 4;
		unsigned char nib4: 4;
		unsigned char nib5: 4;
		unsigned char nib6: 4;
		unsigned char nib7: 4;
		unsigned char nib8: 4;  // highest nibble
	};
} 
nybbles ;

typedef union {
	unsigned long value;
	struct {
		unsigned char byte1: 8;  // lowest bytes
		unsigned char byte2: 8;
		unsigned char byte3: 8;
		unsigned char byte4: 8;  // highest byte
	};
} 
bytes ;

void setup()
{
	// init the RAMDISK
	ramdisk_init ();

	Serial.begin(115200);
	Serial.println("START");
}

// init the RAMDISK
inline void ramdisk_init (void){
	delay(1);
	NRD = 1;  // Idle
	NWR = 1;  // Idle
	NDATA = 0;  // Idle = data mode
	DDRC = 0x00; // sets DATA PORT as the input
	// make a dummy read
	NRD = 0; 
	NRD = 1;
}

// load the 24bit address into the RAMDISK address counter
inline void loadadr (unsigned long addr)
{
	nybbles temp;
	temp.value = addr;

	NDATA = 1;  // address mode	
	DDRC = 0xFF; // sets Data output

		PORTC = (temp.nib6); // 6th nibble of the address - the highest
		NWR = 0; NWR = 1;    // write the nibble into the RAMDISK address counter
		PORTC = (temp.nib5); // 5th nibble of the address
		NWR = 0; NWR = 1;
		PORTC = (temp.nib4); // 4th nibble of the address
		NWR = 0; NWR = 1;
		PORTC = (temp.nib3); // 3rd nibble of the address
		NWR = 0; NWR = 1;
		PORTC = (temp.nib2); // 2nd nibble of the address
		NWR = 0; NWR = 1;
		PORTC = (temp.nib1); // 1st nibble of the address - the lowest
		NWR = 0; NWR = 1;

	NDATA = 0;  // data mode
	DDRC = 0x00; // sets Data input
}

// random read - reads a byte from the addr
inline unsigned char rd_data(unsigned long addr)
{
	unsigned char d;
	loadadr(addr);
	NRD = 0; NRD = 0; NRD = 1; d = PINC;  // atmega reads the port ahead!
	return d;
}

// random write - writes a byte at the addr
inline void wr_data(unsigned long addr, unsigned char data)
{
	loadadr(addr);
	DDRC = 0xFF; // sets Data output
	PORTC = data; NWR = 0; NWR = 1; 
	DDRC = 0x00; // sets Data inp
}


// for block read - reads a byte (w/ addr autoincrement)
inline unsigned char rd_databk()
{
	unsigned char d;
	NRD = 0; NRD = 0; NRD = 1; d = PINC;  // atmega reads the port ahead !
	return d;
}

// for block write - writes a byte (w/ addr autoincrement)
inline void wr_databk(unsigned char data)
{
	PORTC = data; NWR = 0; NWR = 1; 
}

// random read - reads a long from the addr
inline unsigned long rd_datalong(unsigned long addr)
{
	bytes temp;
	loadadr(addr);
	DDRC = 0x00; // sets Data inp
	NRD = 0; NRD = 0; NRD = 1; temp.byte1 = PINC; 
	NRD = 0; NRD = 0; NRD = 1; temp.byte2 = PINC;
	NRD = 0; NRD = 0; NRD = 1; temp.byte3 = PINC;
	NRD = 0; NRD = 0; NRD = 1; temp.byte4 = PINC;
	return temp.value;
}

// random write - writes a long at the addr
inline void wr_datalong(unsigned long addr, unsigned long data)
{
	bytes temp;
	temp.value = data;
	loadadr(addr);
	DDRC = 0xFF; // sets Data output
	PORTC = temp.byte1; NWR = 0; NWR = 1;
	PORTC = temp.byte2; NWR = 0; NWR = 1;
	PORTC = temp.byte3; NWR = 0; NWR = 1;
	PORTC = temp.byte4; NWR = 0; NWR = 1; 
	DDRC = 0x00; // sets Data input
}

// block write - writes nbytes of data from addr
inline void wr_block(unsigned long addr, unsigned long nbytes, unsigned char *data){
	loadadr(addr);
	DDRC = 0xFF; // sets Data output
	while (nbytes--) wr_databk(*data++);		
	DDRC = 0x00; // sets Data input
}	

// block write from external data source - writes nbytes of data from addr
inline void wr_block_ext(unsigned long addr, unsigned long nbytes){
	loadadr(addr);
	DDRC = 0x00; // sets Data input - feed data from an external source
	while (nbytes--){
		// example: clocking by /WR from atmega 
		NWR = 0;  NWR = 1;  // writes data in on rising edge of /WR
		// above could be clocked by external /WR clock source as well
		// writing speed up to 12MB/sec possible		
	}
}

// block read - reads nbytes from addr to data
inline void rd_block(unsigned long addr, unsigned long nbytes, unsigned char *data){
	loadadr(addr);
	while (nbytes--) *data++ = rd_databk();		
}

void loop()
{
	unsigned char di, d = 0xAA;  // d initialised: it is the value written in the LONG write test
	unsigned long int timestart, elapsed, elapsedloop, errors, N;
	volatile unsigned long int i;

#define Nblock 1024
	unsigned char data[Nblock];

	// fill in the data block
	for (i=0; i<Nblock; i++) {
		data[i] = 0xCC;
	}

	N = 500000; // = 8388608;  //number of bytes to be written/read

	// empty loop timing
	timestart = millis();
	for (i=0; i<N; i++) {		
	}
	elapsedloop = millis() - timestart;

	Serial.println("RAMDISK: RANDOM BYTE ACCESS RD/WR TEST");

	timestart = millis();
	for (i=0; i<N; i++) {
		wr_data(i,0xAA);		
	}
	elapsed = millis() - timestart - elapsedloop;
	Serial.print("N. of Byte WRITEs: ");
	Serial.println(N);
	Serial.print(N/elapsed); 
	Serial.println(" kB/sec");

	errors = 0;

	timestart = millis();
	for (i=0; i<N; i++) {
		di = rd_data(i);
		//if (di != 0xAA ) errors++;
	}
	elapsed = millis() - timestart - elapsedloop;
	Serial.print("N. of Byte READs: ");
	Serial.println(N);
	//Serial.print("Errors: ");
	//Serial.println(errors);
	Serial.print(N/elapsed); 
	Serial.println(" kB/sec");

	Serial.println("RAMDISK: RANDOM LONG ACCESS RD/WR TEST");
	
	N = 1000000;
	
	// empty loop timing - longs
	timestart = millis();
	for (i=0; i<N; i=i+4) {
	}
	
	elapsedloop = millis() - timestart;

	timestart = millis();
	DDRC = 0xFF; //sets Data output
	for (i=0; i<N; i=i+4) {
		wr_datalong(i,d);
	}
	DDRC = 0x00; //sets Data input
	elapsed = millis() - timestart - elapsedloop;
	Serial.print("N. of LONG WRITEs: ");
	Serial.println(N/4);
	Serial.print(N/elapsed); 
	Serial.println(" kB/sec");

	timestart = millis();
	DDRC = 0x00; //sets Data inp
	for (i=0; i<N; i=i+4) {
		d = rd_datalong(i);
	}
	elapsed = millis() - timestart - elapsedloop;
	Serial.print("N. of LONG READs: ");
	Serial.println(N/4);
	Serial.print(N/elapsed); 
	Serial.println(" kB/sec");

	Serial.println("RAMDISK: 1024B BLOCK ACCESS RD/WR TEST");
	
	// write a block 1024 bytes large from address 0x55555
	timestart = micros();
	wr_block(0x55555, Nblock, data);
	elapsed = micros() - timestart;
	
	Serial.print("N. of Block Byte WRITEs: ");
	Serial.println(Nblock);
	Serial.print(Nblock*1000L/elapsed); 
	Serial.println(" kB/sec");

	errors = 0;

	// read a block 1024 bytes large from address 0x55555
	timestart = micros();
	rd_block(0x55555, Nblock, data);
	elapsed = micros() - timestart;

	//for (i=0; i<Nblock; i++) {
	//	if (data[i] != 0xCC) errors++;
	//}
	
	Serial.print("N. of Block Byte READs: ");
	Serial.println(Nblock);
	Serial.print(Nblock*1000L/elapsed); 
	Serial.println(" kB/sec");

	//Serial.print("Errors: ");
	//Serial.println(errors);

	Serial.println("STOP");

	while(1);
}

I've updated the above code including the "driver" - faster read by utilizing the "read ahead" feature of the atmega's reading port.
Here is another link to the module:
http://www.rlx.sk/sk/storage-boards-memorystorage-boards-memory/2559-ramdisk-8mb-8mx8-8mbytes-with-8bit-parallel-access.html
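
A side note on loadadr() in the sketch above: it splits the address with a bit-field union. That works with avr-gcc, but the bit-field layout and reading a union member other than the one last written are not guaranteed by the C++ standard. Here is a shift-and-mask alternative as a minimal sketch; `address_nibble` is a hypothetical helper, not part of the original driver.

```cpp
#include <cstdint>

// Returns nibble n of a 24-bit address, where n = 1 is the lowest nibble
// (bits 3..0) and n = 6 is the highest (bits 23..20), matching the
// nib1..nib6 naming in the sketch. n must be in the range 1..6.
inline uint8_t address_nibble(uint32_t addr, uint8_t n) {
    return static_cast<uint8_t>((addr >> (4u * (n - 1u))) & 0x0Fu);
}
```

In loadadr() this would replace `PORTC = (temp.nib6);` with `PORTC = address_nibble(addr, 6);`, and so on down to nibble 1.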

quite impressive Pito!
+1

Updated with rd_block() and wr_block() functions and the related test results (the block times are now real times, loops included). Interesting for someone who might think about porting FAT to it :slight_smile:
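
For reference, this is how the kB/sec figures in the sketch are formed, written as two small helpers. It is a sketch only: the function names are made up, 1 kB is taken as 1000 bytes, and the zero checks are an addition that the test sketch itself does not have.

```cpp
#include <cstdint>

// Throughput in kB/sec from a byte count and a time in milliseconds, with
// the empty-loop time subtracted as in the byte and long tests.
// Returns 0 if the net time is not positive, to avoid dividing by zero.
inline uint32_t kb_per_sec_ms(uint32_t bytes, uint32_t elapsed_ms, uint32_t loop_ms) {
    if (elapsed_ms <= loop_ms) return 0;
    return bytes / (elapsed_ms - loop_ms);
}

// Same figure from a time in microseconds (block tests, loops included).
// A 64-bit intermediate keeps bytes * 1000 from overflowing.
inline uint32_t kb_per_sec_us(uint32_t bytes, uint32_t elapsed_us) {
    if (elapsed_us == 0) return 0;
    return static_cast<uint32_t>((static_cast<uint64_t>(bytes) * 1000u) / elapsed_us);
}
```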

would have liked to try it but no method of shipping to Canada ....

They ship world-wide, afaik - outside EU they charge for an insured letter 4.90Euro VAT exclusive.

I've added an example of writing the external memory from sources other than the atmega PORTC output used in the above examples.
The input could be an ADC, VGA camera module, etc. The ramdisk reads in the data on the rising edge of the /WR signal, and the /WR signal could be clocked by an external source as well. The example below writes 614kB of data (640x480x2 = 614400 bytes), clocking with the atmega's /WR signal (a slow way to do it). You may clock at up to 12MB/sec with an external /WR clock (e.g. the camera chip's rd signal).

// RAMDISK v.1.1 - 8Mbytes - testing with arduino compatible platform
// RAMDISK MODULE requires 3.3Volt signal levels and power supply!
// Not tested with VGA camera chip
// UPDATE: March 12 2014
// Provided as-is, no warranties of any kind
// c 2014 by Pito
/*
RAMDISK MODULE PINS:

           |-----------------------------
DATA0    --- D0                       L  |
DATA1    --- D1                     @ E  |
DATA2    --- D2                       D  |
DATA3    --- D3                          |
Ground   --- GND                         |
DATA4    --- D4     RAMDISK v1.1         |
DATA5    --- D5                          |
DATA6    --- D6                          |
DATA7    --- D7                          |
NMS      --- /MS Module Select (optional)|
NRD      --- /RD Read signal, 85ns min   |
NWR      --- /WR Write signal, 85ns min  |
NDATA    --- /DATA  (A0 with PMP bus)    |
3V3      --- 3.3Volts power              |
           |-----------------------------

LED - indicates rd/wr memory accesses

*/
/*
 * NRD - read signal active low, min 85ns, mind the delayed input on atmega mcu
 * NWR - write signal active low, min 85ns, max 8us
 * NDATA - access to sram data - low, access to controller  - high
 * NMS - module select - active low (must be hardwired to low with single module)
 */

#include <DigitalIO.h>

// HW interface to the RAMDISK module (an example only, do assign yours control pins):
DigitalPin<12> NRD(OUTPUT);     //    /RD active LOW
DigitalPin<13> NWR(OUTPUT);     //    /WR active LOW
DigitalPin<14> NDATA(OUTPUT);   //    /Data active LOW
// PORTC is the Data bus D0-D7 (an example only)

typedef union {
	unsigned long value;
	struct {
		unsigned char nib1: 4;  // lowest nibble
		unsigned char nib2: 4;
		unsigned char nib3: 4;
		unsigned char nib4: 4;
		unsigned char nib5: 4;
		unsigned char nib6: 4;
		unsigned char nib7: 4;
		unsigned char nib8: 4;  // highest nibble
	};
} 
nybbles ;

void setup()
{
	// init the RAMDISK
	ramdisk_init ();
	Serial.begin(115200);
}

// init the RAMDISK
inline void ramdisk_init (void){
	delay(1);
	NRD = 1;  // Idle
	NWR = 1;  // Idle
	NDATA = 0;  // Idle = data mode
	DDRC = 0x00; // sets DATA PORT as the input
	// make a dummy read
	NRD = 0; 
	NRD = 1;
}

// load the 24bit address into the RAMDISK address counter
inline void loadadr (unsigned long addr)
{
	nybbles temp;
	temp.value = addr;

	NDATA = 1;  // address mode
	DDRC = 0xFF; // sets Data output

		PORTC = (temp.nib6); // 6th nibble of the address - the highest
		NWR = 0; NWR = 1;    // write the nibble into the RAMDISK address counter
		PORTC = (temp.nib5); // 5th nibble of the address
		NWR = 0; NWR = 1;
		PORTC = (temp.nib4); // 4th nibble of the address
		NWR = 0; NWR = 1;
		PORTC = (temp.nib3); // 3rd nibble of the address
		NWR = 0; NWR = 1;
		PORTC = (temp.nib2); // 2nd nibble of the address
		NWR = 0; NWR = 1;
		PORTC = (temp.nib1); // 1st nibble of the address - the lowest
		NWR = 0; NWR = 1;

	NDATA = 0;  // data mode
	DDRC = 0x00; // sets Data input
}

// block write from external data source - writes nbytes of data from addr
inline void wr_block_ext(unsigned long addr, unsigned long nbytes){
	loadadr(addr);
	DDRC = 0x00; // sets Data input - feed data from an external source
	while (nbytes--){
		// example: clocking by /WR from atmega 
		NWR = 0;  NWR = 1;  // writes data in on rising edge of /WR
		// above could be clocked by external /WR clock source as well
		// writing speed up to 12MB/sec possible		
	}
}

void loop()
{
	unsigned char di, d;
	unsigned long int timestart, elapsed, elapsedloop, errors, N;
	volatile unsigned long int i;

	Serial.println("RAMDISK: N-BLOCK EXT INPUT DATA WR TEST (ATMEGA's /WR CLOCKING)");
	
	// writes a block of N-bytes from address 0x11111
	// the data are fed in via ramdisk's data bus, bypassing atmega port
	// possible inputs - flash ADC, VGA video chip, etc.
	// the /WR works as a clock, it clocks in data on /WR rising edge
	// you may clock  /WR with atmega (this example) or via an external source
	timestart = micros();
	wr_block_ext(0x11111, 614400L);
	elapsed = micros() - timestart;
	
	Serial.print("N. of Block Byte WRITEs: ");
	Serial.println(614400L);
	Serial.print(614400*1000L/elapsed); 
	Serial.println(" kB/sec");

	while(1);
}
RAMDISK: N-BLOCK EXT INPUT DATA WR TEST (ATMEGA's /WR CLOCKING)
N. of Block Byte WRITEs: 614400
1589 kB/sec
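
To put the result in terms of camera frames, here is the arithmetic as a small sketch. The helper names are made up for illustration, and 1 kB is taken as 1000 bytes, as in the kB/sec figure above.

```cpp
#include <cstdint>

// Bytes in one frame: width x height x bytes per pixel.
constexpr uint32_t frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel) {
    return width * height * bytes_per_pixel;
}

// Time in milliseconds to move `bytes` at `kb_per_sec`, rounded down.
// Returns 0 for a zero rate instead of dividing by zero.
constexpr uint32_t transfer_ms(uint32_t bytes, uint32_t kb_per_sec) {
    return kb_per_sec == 0 ? 0 : bytes / kb_per_sec;
}
```

With the 1589 kB/sec printed above, one 640x480x2 frame (614400 bytes) takes about 386 ms. At the 12MB/sec quoted for an external /WR clock it would be about 51 ms.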

I love the idea of using this module but would rather add the components to my own board. Are there any schematics? Or can you tell me the chip being used?


Rob

Rob, afaik the internals are quite complex and not intended for the DIY domain. It is meant as a cheap "as-is plug and play component" that experienced users can quickly drop into their designs.

Yeah ok, there's a CPLD on there, not much chance of duplicating that.


Rob

Enclosed please find version 0.9 of the 8MB Ramdisk driver for fat16lib's latest "RamDisk File System".
P.

M8MBRDSK11.zip (2.57 KB)

I have no immediate use for this but love the idea. Good work.


Rob

When you need to set the Address to 0 (for example when always writing/reading a buffer from ramdisk's zero Address) you may use this shortcut:

// reset the Address to 0x000000L
inline void resetadr (){
	NDATA = 1;  // address mode	
	NRD = 0;  NRD = 1;
	NDATA = 0;  // data mode
}

Moreover, you can stop writing the Address nibbles after any nibble :P, so you may write only the "required" number of Address nibbles (1 or 2 or 3 or 4 or 5 or 6).

That speeds up setting the Address. For example, if you work with 128 blocks of 65kB (65536 bytes) each, you need not set all 6 nibbles of the 8MB Address, only the top 2:

// Load the BLOCK Address into the RAMDISK
// a BLOCK is a 65kB large chunk of ram starting at 0x0000
inline void ldadr_block65k (unsigned char addr)
{
	NDATA = 1;  // address mode	
	// the trick with zeroing the 24bit address
	NRD = 0;  NRD = 1;
	// now load the 2 nibbles of the BLOCK address
	DDRA = 0xFF; // sets Data output
		PORTA = (addr >> 4); // 6th nibble of the address - the highest
		NWR = 0; NWR = 1;    // 
		PORTA = (addr); 	// 5th nibble of the address
		NWR = 0; NWR = 1;
	// 4th, 3rd, 2nd, 1st nibble are zero - here we start to read/write
	DDRA = 0x00; // sets Data input
	NDATA = 0;  // data mode
}
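
The block arithmetic behind ldadr_block65k(), spelled out as a small sketch. The constant and function names are made up for illustration. Masking the low nibble with 0x0F is an extra precaution here, on the assumption that only D0-D3 matter in address mode.

```cpp
#include <cstdint>

constexpr uint32_t kBlockBytes = 65536;  // one "65kB" block = 0x10000 bytes
constexpr uint32_t kBlockCount = 128;    // 128 x 65536 = 8,388,608 bytes (8MB)

// 24-bit start address of a block: the block number lands in nibbles 6 and 5,
// nibbles 4..1 stay zero.
constexpr uint32_t block_start_address(uint8_t block) {
    return static_cast<uint32_t>(block) << 16;
}

// The two nibbles that get written for a block, highest first.
constexpr uint8_t block_nibble_hi(uint8_t block) {
    return static_cast<uint8_t>(block >> 4);
}
constexpr uint8_t block_nibble_lo(uint8_t block) {
    return static_cast<uint8_t>(block & 0x0F);
}
```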

With a standard external HW SRAM bus on the DUE, PIC32 or others, you may simply write the 24bit Address with A0=1 (when, for example, you use the A0 address line) and then read/write all the Data bytes with A0=0. You need not mess with bit-banging the /RD and /WR signals, or with setting the direction of the 8bit Data port - that is done for you by the MCU's external memory bus (EMB on DUE, PMP on PIC32, FSMC on STM32, etc.). You do have to set the timing of the /RD and /WR signals as required, however.

For example - a pseudo-code:

//write 24bit Address, A0=1
bus_write(1, 6th_nibble); //the highest
bus_write(1, 5th_nibble);
bus_write(1, 4th_nibble);
bus_write(1, 3rd_nibble);
bus_write(1, 2nd_nibble);
bus_write(1, 1st_nibble);
// write 100000 bytes of Data (w/ auto-increment)
for i=1 to 100000 
bus_write(0, data[i]);
next i

or

// read 100000 bytes of Data (w/ auto-increment)
for i=1 to 100000
data[i] = bus_read(0);
next i

When working with blocks (see above):

//write 8bit BLOCK Address (up to 128 BLOCKS each BLOCK 65kB large)
dummy = bus_read(1);  //resets Address to zero (0L)
bus_write(1, 6th_nibble); //the highest
bus_write(1, 5th_nibble);
// write 65kBytes of Data (w/ auto-increment)
for i=1 to 65536 
bus_write(0, data[i]);
next i

or

// read 65kBytes of Data (w/ auto-increment)
for i=1 to 65536 
data[i] = bus_read(0);
next i
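
The pseudo-code above can be tried on a PC against a toy model of the module. This is a sketch under loud assumptions: the real CPLD logic is not published, so `RamdiskModel` and its nibble bookkeeping (nibble sequence restarting after any data access or address reset, address wrapping at 8MB) are guesses based only on the description in this thread. All names are made up for illustration.

```cpp
#include <cstdint>
#include <vector>

// Toy host-side model of the ramdisk; NOT the real controller logic.
class RamdiskModel {
public:
    RamdiskModel() : mem_(kSize), addr_(0), next_nibble_(5) {}

    // a0 = 1: write one address nibble (highest first); a0 = 0: write data.
    void bus_write(int a0, uint8_t value) {
        if (a0) {
            const unsigned shift = 4u * next_nibble_;
            addr_ = (addr_ & ~(0xFu << shift)) |
                    (static_cast<uint32_t>(value & 0x0Fu) << shift);
            addr_ &= kSize - 1u;  // assumed: bits above the 8MB range are ignored
            next_nibble_ = (next_nibble_ == 0u) ? 5u : next_nibble_ - 1u;
        } else {
            next_nibble_ = 5u;    // assumed: a data access restarts the nibble sequence
            mem_[addr_] = value;
            advance();
        }
    }

    // a0 = 1: reset the address to 0; a0 = 0: read data.
    uint8_t bus_read(int a0) {
        next_nibble_ = 5u;
        if (a0) { addr_ = 0; return 0; }
        const uint8_t v = mem_[addr_];
        advance();
        return v;
    }

    uint32_t address() const { return addr_; }

private:
    static constexpr uint32_t kSize = 8u * 1024u * 1024u;
    void advance() { addr_ = (addr_ + 1u) % kSize; }  // auto-increment, assumed to wrap
    std::vector<uint8_t> mem_;
    uint32_t addr_;
    unsigned next_nibble_;
};

// Write the full 24bit Address, highest nibble first (A0 = 1).
void set_address(RamdiskModel &rd, uint32_t addr) {
    for (int n = 5; n >= 0; --n)
        rd.bus_write(1, static_cast<uint8_t>((addr >> (4 * n)) & 0x0F));
}

// Reset the Address to zero, then write only the two block nibbles.
void set_block(RamdiskModel &rd, uint8_t block) {
    rd.bus_read(1);
    rd.bus_write(1, static_cast<uint8_t>(block >> 4));    // 6th nibble
    rd.bus_write(1, static_cast<uint8_t>(block & 0x0F));  // 5th nibble
}
```

On real hardware the two `bus_*` calls would map to accesses through the MCU's external memory bus, with A0 selecting address or data mode.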

You may connect the 8MB Ramdisk to your DUE via DUE's External Memory Bus easily:

With Ax = 1 you write in the 24bit starting "Address" 
With Ax = 0 you write/read the Data bytes sequentially from the "Address"
You may use any DUE address line.

DUE and 8MB RAMDISK.jpg

Hello Pito,

thx for this great work!

I have ordered two of those 8MB Ram devices... This is exactly what I need! :slight_smile:

Kind Regards,

Andreas

For the DUE, here is an example of how to use the 8MB Ramdisk on the DUE's External Memory Bus (just code fragments here; they need to be put into your own code):

// DUE timing settings and Examples for 8MB Ramdisk
// Pito 4/2014
// Examples only, not tested
// Library used: https://github.com/delsauce/ArduinoDueParallel
// DUE at 84MHz (12ns clock)
// No warranties of any kind

// Wiring:
RDisk	DUE (EMB signal names)
================
D0-D7	D0-D7
/WR	NWE
/RD	NRD
/DATA	A0

// Configure parallel bus for 8bits, no CS, A0, and NRD and NWE
Parallel.begin(PARALLEL_BUS_WIDTH_8, 0, 1, 1, 1);

// Configure bus timings.. EXPERIMENTAL (assumption: WR pulse could be shorter than RD pulse)
// Otherwise WR=RD timing
// We do not use any real addressing
Parallel.setAddressSetupTiming(0,0,0,0);
// NWE, NCSWE, NRD, NCSRD  - we do not use NCSs
Parallel.setPulseTiming(5,0,8,0); // or (4,0,7,0) or (6,0,9,0)
// better read datasheet for CycleTiming:
Parallel.setCycleTiming(6,9);  // or (5,8) or (7,10)


// Setting of the starting Address (8MB max)
void set_address(uint32_t Address){
	// below sequence could be shorter when setting only some upper nibbles
	// nibbles are taken with >> and a 0x0F mask (a union as above works too)
	Parallel.write(0x01, (Address >> 20) & 0x0F);  // addr[23..20]
	Parallel.write(0x01, (Address >> 16) & 0x0F);
	Parallel.write(0x01, (Address >> 12) & 0x0F);
	Parallel.write(0x01, (Address >> 8) & 0x0F);
	Parallel.write(0x01, (Address >> 4) & 0x0F);
	Parallel.write(0x01, Address & 0x0F);  // addr[3..0]
	//Parallel.write(0x01, CONTROL);  // 0x00 default, 0x01-autodecrement, 0x02-wrprotect, 0x03-both
}

// Write 1MB of data=0xAA from Address=0x2FFFFF
set_address(0x2FFFFF);
for(uint32_t i=0; i<1048576UL; i++){  // 1MB = 1048576 bytes
	Parallel.write(0x00, 0xAA);
}

// Read and sum 1MB of data from Address=0x2FFFFF
uint32_t sum = 0;
set_address(0x2FFFFF);
for(uint32_t i=0; i<1048576UL; i++){
	sum = sum + Parallel.read(0x00);
}

// Fast reset of the Address to 0x000000
void reset_address(){
	Parallel.read(0x01);
}

// Example: addressing of a 65kB large Block
// Max 128 Blocks of 65KB each with 8MB Ramdisk, a pity
void set_block_address(uint8_t BlockNumber){
	Parallel.write(0x01, BlockNumber>>4);  // addr[23..20]
	Parallel.write(0x01, BlockNumber);  // addr[19..16]
}

// set the Address at the beginning of the Block N.6
reset_address();  // Address is 0x000000 now
set_block_address(0x06); // set 2 highest nibbles only - the Block number
// Write 65kB of data=0xAA into the block N.6
for(uint32_t i=0; i<65536UL; i++){  // 65kB = 65536 bytes
	Parallel.write(0x00, 0xAA);
}

// set the Address at the beginning of the Block N.6
reset_address();  // Address is 0x000000 now
set_block_address(0x06); // set 2 highest nibbles only - the Block number
// read and sum 65kB worth of data from the block N.6
uint32_t sum = 0;
for(uint32_t i=0; i<65536UL; i++){
	sum = sum + Parallel.read(0x00);
}
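
A possible reading of the (5,0,8,0) pulse values, assuming setPulseTiming() takes counts of 84MHz clock cycles (about 11.9ns each), which is an assumption about the library and not something verified here: 5 cycles is roughly 59.5ns, covering the 48ns /WR minimum, and 8 cycles is roughly 95.2ns, covering the 85ns /RD minimum from the pin listing in the March 31 sketch. The helpers below are a minimal sketch of that arithmetic with made-up names.

```cpp
#include <cstdint>

// Width in picoseconds of `cycles` clock periods at `clock_hz`, rounded down.
constexpr uint64_t cycles_to_ps(uint32_t cycles, uint32_t clock_hz) {
    return (static_cast<uint64_t>(cycles) * 1000000000000ULL) / clock_hz;
}

// Smallest whole number of cycles at `clock_hz` that lasts at least `min_ns`.
constexpr uint32_t min_cycles_for_ns(uint32_t min_ns, uint32_t clock_hz) {
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(min_ns) * clock_hz + 999999999ULL) / 1000000000ULL);
}
```

By the same arithmetic the alternative (4,0,7,0) gives roughly 47.6ns and 83.3ns, slightly under the stated minimums, so it would be the setting to treat as experimental.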

In case you can change the EMB timing parameters on-the-fly with the Due (most probably you can), you can speed up setting the Address significantly. The Address is written into the CPLD controller, so it need not comply with the PSRAM timings.

A 15-20ns long /WR pulse works fine while writing the Address.

// Setting of the starting Address (8MB max)
void set_address(uint32_t Address){
	// below sequence could be shorter when setting up only some of the nibbles
	// nibbles are taken with >> and a 0x0F mask (a union as above works too)
	// TODO: set the write pulse to 15-20ns width here
	Parallel.write(0x01, (Address >> 20) & 0x0F);  // addr[23..20]
	Parallel.write(0x01, (Address >> 16) & 0x0F);
	Parallel.write(0x01, (Address >> 12) & 0x0F);
	Parallel.write(0x01, (Address >> 8) & 0x0F);
	Parallel.write(0x01, (Address >> 4) & 0x0F);
	Parallel.write(0x01, Address & 0x0F);  // addr[3..0]
	//Parallel.write(0x01, CONTROL);
	// TODO: set the write pulse back to the data timing here
}

pito, very impressive work you've done!

Your hardware setup looks quite similar to a mini-board I've been reviewing. It's only 1"x2", has 32 MByte SDRAM, 8 Mbit Flash and a 600,000-gate FPGA, it's completely open source and costs $69. What I like is that the FPGA communicates with the SDRAM as if it's standard RAM. They explain how the code can be modified for dual-port, tri-port or quad-port memory access. The Flash card could be buffered at such a high level that even the slowest cards shouldn't cause any hiccups.

They have excellent documentation and training manuals published.
I've been "dreaming" about a mini-card like this with a SAM3X on it.

There's a current thread about one being made, I can't find it now though.

EDIT: Found it, Small-footprint Due - Arduino Due - Arduino Forum


Rob