Storing bunch of 8-digit numbers

Hi.
I have one quite interesting problem.
I can not use any additional HW and am using Mega as main board.

So, i need to check 100 000 ( one hundred thousand ) codes that are entered over keypad.
They should be used in cca 3 months.

How is this done:

  • ADMIN enters first code
  • our program generates code on the basis --> first code + 1 , up to 100 000
  • e.g. if the first code is 100 000, end code will be 200 000
  • all the codes in between will be used as entering codes.
  • each code is valid for one day and it can be used 3 times in this day.

Checking the codes:

  • for now we have set one loop that compares each entered code and writes some value
  • written values are for code validation and number of uses.

Problems:

  • we run into quite problems but it looks that the main one is memory and storing of data.
  • we can not use integer and we can not find some good approach how to check if the code is valid and how may time it was used.
  • it would be simpler if we have to use smaller code scale but 100 000 code , with 8-digit, presents quite a problem.

Any suggestion on how to solve this is very welcome !

Arnix

long

It is not clear why you need 100,000 codes each day.

For that kind of storage I would recommend a file on SD card.

If the codes are sequential then it seems that someone assigned code 100385 would be able to try codes 100384, 100383... until they found a code that had not used all three entries for the day. Seems like a bad design.

unsigned long, can be 0x00000000 to 0xFFFFFFFF (hex)
or 0 to 2^32-1, = to 4,294,967,295 (decimal)
4 bytes of data.
Use EEPROM Write Anything library if need to store them during power outages

You only need to remember the first number. All other numbers are basically offsets.

For now, lets keep it simple and each code can only be used once. So you need an array of 100,000 bits to store the status of each number (used or not used). When the admin sets the first code, the 100,000 bits are cleared. If code 100,000 is used, you set the corresponding bit to 1, when code 100,003 is used, you set the corresponding bit and so one. Problem is that you need 12,500 bytes of memory (ram or eeprom). The problem becomes worse if you need to keep track how often the code is used (that will require 2 bits in the array and hence 25,000 bytes in total).

You need to find a way to compress this further so it fits in available memory, whichever memory it is. Note that the processing for this will also require some memory.

How do you know which code is valid for which day? That also needs to be stored somewhere?

I think you have an impossible task due to the limitations of the Mega and the requirement of no additional hardware. A 32k eeprom would make life so easy.

Just brainstorming here, how about writing to flash using majekw's modified optiboot: GitHub - majekw/optiboot: Small and Fast Bootloader for Arduino and other Atmel AVR chips? It supports ATmega2560. I've never used it so I'm not sure if this would be the solution but the ATmega2560 has lots of flash at least so that seems the best chance if you can't use additional hardware.

Alternatively, if all you are doing is indexing the code every three matches, then use a char array for the code... something like this (error checking isn't perfect)

#define MAX_CODE 20000000UL
#define MIN_CODE 10000000UL

char dailyCode[8];
char buffer[8];

unsigned long code = 9999999;

int keypadTries = 0;

void setup() 
{
  Serial.begin(9600);
  Serial.println("Started...");
  updateCodeArray();
  simplePrint(dailyCode, sizeof(dailyCode));
}

void loop() 
{
  static int index = 0;
  if(Serial.available())
  {
    char thisChar = Serial.read();
    if(thisChar == 'x')
    {
      index = 0;
      Serial.println("got it");
      Serial.print("Input = ");
      simplePrint(buffer, sizeof(buffer));
      Serial.print("Code  = ");
      simplePrint(dailyCode, sizeof(dailyCode));
      if(compareStrings(buffer, sizeof(buffer), dailyCode, sizeof(dailyCode)))
      {
        Serial.println("matched");
        if(++keypadTries > 2)
        {
          keypadTries = 0;
          updateCodeArray();
          Serial.print("new code = ");
          simplePrint(dailyCode, sizeof(dailyCode));
        }
      }
    }
    else
    {
      buffer[index++] = thisChar;
      if(index > sizeof(buffer))
      {
        index = 0;
      }
    }
  }
}

bool compareStrings(char* array1, size_t size1, char* array2, size_t size2)
{
  //bool matched = true;  //EDIT not needed!
  if(size1 != size2) return false;
  for(int i = 0; i < size1; i++)
  {
    if(i[array1] != i[array2]) return false;
  }
  return true;
}

void simplePrint(char* array, size_t size)
{
  for(int i = 0; i < size; i++)
  {
    Serial.print(i[array]);
  }
  Serial.println();
}

void updateCodeArray()
{
  if (++code > (MAX_CODE - 1))
  {
    code = MIN_CODE;
  }
  unsigned long num = code;
  for(int i = 0; i < 8; i++)
  {
    dailyCode[7-i] = static_cast<char>((num % 10) + 48);
    num /= 10;
  }
}

enter an 8 digit code and an 'x' behind it like this:

10000000x

I think we need a bit more info to provide guidance

Can any code be used any day?
What prevents someone to try a different code, 100,000 to choose from consecutively is pretty risky?
Does the machine need to survive a power crash? Can you consider re-issuing new codes in that case?
How many codes are likely to be entered in a 3 month period out of the 100,000?

I agree. It's about time the OP chimed in, or further thought is pointless.

i need to check 100 000 ( one hundred thousand ) codes that are entered over keypad.

  • each code is valid for one day and it can be used 3 times in this day.

I'm not understanding how one code per day for 3 months ends up being 100000 codes?

100000 codes with 8 digits is 400000 bytes, so it's well within the reach of Serial EEPROMs or flash memory...

@westfw

The OP might have 100,000 potential authorized users, give them a code which is valid for 3 months and once they use it the code is only valid for the day and they can use it max 3 times.

It does not mean all of them will use their code. (but as noted before then there is a massive security flaw in that design).

we don't know exactly what the OP wants to achieve, hence the questions above.

for your math - 100000 codes of 8 digits (assuming digits are 0 to 9 hence fit on 4 bits) is indeed 400k byte but you also need to store 2 bits as count flag for the access (accessed 0,1,2,3 times), and for each entry the day it was accessed - can be represented as an unsigned long time_t timecode (4 bytes).

so you really need if you want completeness 4+4 +1 = 9 byte per entry so 900,000 bytes — closer to 1 Meg than 512k EEPROM

I can forsee the initializing of that large memory to be pretty long as well - an EEPROM byte write takes 3 to 4 ms to complete. so for 900,000 bytes to set you will need 1h roughly before you system is operational (4ms x 90000 = 3600 seconds = 1h).

I can forsee the initializing of that large memory to be pretty long as well - an EEPROM byte write takes 3 to 4 ms to complete. so for 900,000 bytes to set you will need 1h roughly

It looks like large EEPROMs typically have a page-write command that would speed that up considerably...

@J-M-L

It can be far more efficient (storage wise) if one does not think in bytes but in bits. Storing the number of times it is used can be stored in 2 bits. So basically an array of 2 bit fields instead of bytes (as I showed earlier).

The same can apply for a date. Store the startdate and all other dates are offsets; for a month, that would require 5 bits, for three month 7 bits.

For a month, this would require 2+5 bits per code so basically 100,000 bytes.

long startcode;
long startdate;

struct PINCODE
{
  byte date: 5;    // day offset to against start date
  byte count: 2;
};

PINCODE codes[100000];

The above wastes 1 bit (can be used to achieve two months) and takes 100,000 bytes plus 8 bytes for the start values.

For three months, I will probably use two arrays

long startcode;
long startdate;

byte dates[100000]; // wastes one bit per offset
byte counts[25000]; // 2 bits per count

Total 125,000 bytes plus 8 bytes for the start values.

Both are far from your 1MByte :wink:

My earlier comment on 32k eeprom was solely based on the count and did not include dates.

indeed Westfw

but this is not as straightforward as it sounds because arduino library does not support it by default, use 32 bytes (wth 30 effective) for storing data and that does not bode well with 64 bit EEPROM pages.

that was discussed here a while back

@sterretje

agree I've not optimized for space in EEPROM besides the 8 chars of the code, more for simplicity of accessing the data and using existing time library directly. by forcing multiple codes to share the same info byte (like 4 codes would use the same byte to represent how often they have been accessed) you increase the wear of the EEPROM by 4 times.

My view is that depending on the OP use case scenario (high access or low access for the code) - you might just be better off not initializing anything and maintaining an array of used codes which you build dynamically on demand rather than having everything set up initially

so having a bit more info on what the intent is would help to find what are the best trade offs

Thank you guys for reply.
I was struggling all night to solve this because i must complete it till tomorrow.
Altogether the code is working but i get some validation delay.
If i enter some code in defined range, i have to wait around 5 -8 seconds for validation.

I would like to share the whole code that we have so far but i can't because of business policy.
Therefore i have send PM to sterrutje and bulldogLowell with full code, so they can just check out and see how is this done.

PART of the code for validation:

  String stringKod = String(Input);

      for ( long i = prviKod; i <= prviKod + 50000 ; i++) {
        // check if found
        String  ip = String(i);

        if (stringKod == ip) {                        
          
          for (int j = 0; j < brojKodova; j++ ) {
            if (GenKodovi1[j].sifra == stringKod) {
              GenKodovi1[j].brojUpotrebe = GenKodovi1[j].brojUpotrebe + 1;
            }
            else {
              GenKodovi1[pozicija].sifra = stringKod;
              GenKodovi1[pozicija].brojUpotrebe = GenKodovi1[pozicija].brojUpotrebe + 1;
              pozicija++;

            }
            break;
          }
          break;
        }
      }

      for ( long i = drugiKod; i <= drugiKod + 50000 ; i++) {
        String  ip = String(i);
        if (stringKod == ip) {                              //Serial.println("HIT");
          
          for (int j = 0; j < brojKodova; j++ ) {
            if (GenKodovi1[j].sifra == stringKod) {
              GenKodovi1[j].brojUpotrebe = GenKodovi1[j].brojUpotrebe + 1;
            }
            else {
              GenKodovi1[pozicija].sifra = stringKod;
              GenKodovi1[pozicija].brojUpotrebe = GenKodovi1[pozicija].brojUpotrebe + 1;
              pozicija++;
            }
            break;
          }
          break;
        }
      }
     
      // OK <--> OK / WRONG
      bool pogodjeno = false;

      for (int i = 0; i < brojKodova; i++) {
        if (GenKodovi1[i].sifra == stringKod && GenKodovi1[i].brojUpotrebe <= brojKoristenjaKoda) {
          lcd.print("VAZECI");
          otvoriRelay();
          pogodjeno = true;
          break;
        }
      }

This is the situation:

  • you have two cash registers
  • each cash register will generate around 200 bills per day
  • on each bill you have "transaction number" - 8 digit code, and this is used for code verification.
  • user takes this bill and types the code
  • if the code is used less then 5 times and in the same day, code is valid.
  • coin acceptor is also built in but i have to add it into this code...
  • btw. electronics will be shut down each evening, so we are using this instead of day counter !

Real time situation:

  • ADMIN enters code for first cash register ( e.g. 50 000 )
  • our program stores this variable
  • ADMIN enters code for second cash register ( e.g. 100 000 )
  • our program stores this variable
  • loop checks if there is a code in range 50 000 - 100 000
  • and if there is a code in range 100 000 - 150 000
  • if so, and if everything else is OK, code is valid.

Arnix

why do you use String objects when a long could do?

for example when you do

  String stringKod = String(Input);

      for ( long i = prviKod; i <= prviKod + 50000 ; i++) {
        // check if found
        String  ip = String(i);

        if (stringKod == ip) {  
...

you have to allocate memory for a String 50,000 times and build its decimal representation... that's slow and also creates a massive fragmentation of your memory...

you would be much better off to transform the string you received as Input into a long number and then do arithmetics on the number rather than compare on String objects

char * pEnd;
long longKod = strtol (Input.c_str(), &pEnd, 10); // if this returns zero, then could not parse the input. should check for robustness... I let you do this

      for ( long i = prviKod; i <= prviKod + 50000 ; i++) {
        // check if found

        if (longKod == ip) {
...

you would be much faster because comparing numbers is easier than comparing Strings objects and use way less memory.

Sorry guys i could not reply sooner !
Thank you all for participating in this thread.

We implemented suggestion from J-M-L and mixed it with idea from "sterretje and B_D_Lowell".
The electronics was installed on Monday afternoon and its working perfect.

Thnx again !