Best method to find repeated value

I've already specified it was just an "imaginary" example, to get the idea across. As I also said how it doesn't make sense for actual sensor readings. Which is not what I am doing.

An example of what? What idea? What are you doing?

Frequently, contributors here come up with more masterful solutions, when a reasonably complete actual problem is presented. Doing it by example, does not offer as much potential.

Firstly, a solution has been presented. There are 3 different solutions available.
Secondly, I did specify exactly what I am doing in one of the initial comments.

That the ML models threshold the info into hard-edged classes and then you combine them in this way seems like it discards information. A good ML model should also be able to give its quality/certainty about its favorite (and not favorite) answers, which could be combined with traditional means.

Neural Networks do give accuracy per class.
ML models do not. Unless something like XGBoost with softmax function or Relevance Vector Machines, is used. The purpose of ML models is to clasify therefore, select a "label".
CNN models use the same softmax function as XGboost so it displays the output.

CNN, with 40k samples, is not possible on Arduino, even with TensorFlow Lite or Edge Impulse.
Meanwhile, the ML models on Arduino do not provide softmax functions (not even on PC, directly in Python, unless the ML can change the function like XGBoost, as I said.

What many data scientists do with difficult to classify problems, is they use multiple ML models. Sorcialized for different outputs, and they make a decision tree/flow chart to choose the best outcome.

In fact there are already 3 algorithms: Random Forest, Decision Tree and XGBoost that do just that.

In terms of regression models you get a float output, so the story is different there.

Nevertheless, this post is not about ML functionality and what not. So I'd rather end it here.

Many thanks to the users that contributed with the different solutions!

This is perhaps the least elegant solution. Certainly not the best.

It is based on a lookup table, one entry for every a,b,c,d combination, for a total of 5^4=625 combinations each with two possibilities, one winner or two winners (a tie).

Every (value, frequency) pair is encoded in a byte of the lookup table, the value is stored in the higher nibble, its frequency in the lower nibble, so the decimal numbers you see will make little sense. A table entry is zero when there is not a corresponding useful value combination ("no winner").

The table itself is not optimized in any way, to make code small. Its size is 1250 bytes. You can see that we have 120 combinations where there is no winner, so the entry is {0,0}

Only the values array and the getFrequency() function are really used, the rest helps only to test the functionality.

//https://forum.arduino.cc/t/best-method-to-find-repeated-value/982067

byte values[625][2] = {
  {4, 0} , {3, 0} , {3, 0} , {3, 0} , {3, 0} , {3, 0} , {2, 18} , {2, 0} , {2, 0} , {2, 0} , {3, 0} , {2, 0} , {2, 34} , {2, 0} , {2, 0} , {3, 0} , {2, 0} , {2, 0} , {2, 50} , {2, 0} ,
  {3, 0} , {2, 0} , {2, 0} , {2, 0} , {2, 66} , {3, 0} , {2, 18} , {2, 0} , {2, 0} , {2, 0} , {2, 18} , {19, 0} , {18, 0} , {18, 0} , {18, 0} , {2, 0} , {18, 0} , {34, 0} , {0, 0} , {0, 0} ,
  {2, 0} , {18, 0} , {0, 0} , {50, 0} , {0, 0} , {2, 0} , {18, 0} , {0, 0} , {0, 0} , {66, 0} , {3, 0} , {2, 0} , {2, 34} , {2, 0} , {2, 0} , {2, 0} , {18, 0} , {34, 0} , {0, 0} , {0, 0} ,
  {2, 34} , {34, 0} , {35, 0} , {34, 0} , {34, 0} , {2, 0} , {0, 0} , {34, 0} , {50, 0} , {0, 0} , {2, 0} , {0, 0} , {34, 0} , {0, 0} , {66, 0} , {3, 0} , {2, 0} , {2, 0} , {2, 50} , {2, 0} ,
  {2, 0} , {18, 0} , {0, 0} , {50, 0} , {0, 0} , {2, 0} , {0, 0} , {34, 0} , {50, 0} , {0, 0} , {2, 50} , {50, 0} , {50, 0} , {51, 0} , {50, 0} , {2, 0} , {0, 0} , {0, 0} , {50, 0} , {66, 0} ,
  {3, 0} , {2, 0} , {2, 0} , {2, 0} , {2, 66} , {2, 0} , {18, 0} , {0, 0} , {0, 0} , {66, 0} , {2, 0} , {0, 0} , {34, 0} , {0, 0} , {66, 0} , {2, 0} , {0, 0} , {0, 0} , {50, 0} , {66, 0} ,
  {2, 66} , {66, 0} , {66, 0} , {66, 0} , {67, 0} , {3, 0} , {2, 18} , {2, 0} , {2, 0} , {2, 0} , {2, 18} , {19, 0} , {18, 0} , {18, 0} , {18, 0} , {2, 0} , {18, 0} , {34, 0} , {0, 0} , {0, 0} ,
  {2, 0} , {18, 0} , {0, 0} , {50, 0} , {0, 0} , {2, 0} , {18, 0} , {0, 0} , {0, 0} , {66, 0} , {2, 18} , {19, 0} , {18, 0} , {18, 0} , {18, 0} , {19, 0} , {20, 0} , {19, 0} , {19, 0} , {19, 0} ,
  {18, 0} , {19, 0} , {18, 34} , {18, 0} , {18, 0} , {18, 0} , {19, 0} , {18, 0} , {18, 50} , {18, 0} , {18, 0} , {19, 0} , {18, 0} , {18, 0} , {18, 66} , {2, 0} , {18, 0} , {34, 0} , {0, 0} , {0, 0} ,
  {18, 0} , {19, 0} , {18, 34} , {18, 0} , {18, 0} , {34, 0} , {18, 34} , {35, 0} , {34, 0} , {34, 0} , {0, 0} , {18, 0} , {34, 0} , {50, 0} , {0, 0} , {0, 0} , {18, 0} , {34, 0} , {0, 0} , {66, 0} ,
  {2, 0} , {18, 0} , {0, 0} , {50, 0} , {0, 0} , {18, 0} , {19, 0} , {18, 0} , {18, 50} , {18, 0} , {0, 0} , {18, 0} , {34, 0} , {50, 0} , {0, 0} , {50, 0} , {18, 50} , {50, 0} , {51, 0} , {50, 0} ,
  {0, 0} , {18, 0} , {0, 0} , {50, 0} , {66, 0} , {2, 0} , {18, 0} , {0, 0} , {0, 0} , {66, 0} , {18, 0} , {19, 0} , {18, 0} , {18, 0} , {18, 66} , {0, 0} , {18, 0} , {34, 0} , {0, 0} , {66, 0} ,
  {0, 0} , {18, 0} , {0, 0} , {50, 0} , {66, 0} , {66, 0} , {18, 66} , {66, 0} , {66, 0} , {67, 0} , {3, 0} , {2, 0} , {2, 34} , {2, 0} , {2, 0} , {2, 0} , {18, 0} , {34, 0} , {0, 0} , {0, 0} ,
  {2, 34} , {34, 0} , {35, 0} , {34, 0} , {34, 0} , {2, 0} , {0, 0} , {34, 0} , {50, 0} , {0, 0} , {2, 0} , {0, 0} , {34, 0} , {0, 0} , {66, 0} , {2, 0} , {18, 0} , {34, 0} , {0, 0} , {0, 0} ,
  {18, 0} , {19, 0} , {18, 34} , {18, 0} , {18, 0} , {34, 0} , {18, 34} , {35, 0} , {34, 0} , {34, 0} , {0, 0} , {18, 0} , {34, 0} , {50, 0} , {0, 0} , {0, 0} , {18, 0} , {34, 0} , {0, 0} , {66, 0} ,
  {2, 34} , {34, 0} , {35, 0} , {34, 0} , {34, 0} , {34, 0} , {18, 34} , {35, 0} , {34, 0} , {34, 0} , {35, 0} , {35, 0} , {36, 0} , {35, 0} , {35, 0} , {34, 0} , {34, 0} , {35, 0} , {34, 50} , {34, 0} ,
  {34, 0} , {34, 0} , {35, 0} , {34, 0} , {34, 66} , {2, 0} , {0, 0} , {34, 0} , {50, 0} , {0, 0} , {0, 0} , {18, 0} , {34, 0} , {50, 0} , {0, 0} , {34, 0} , {34, 0} , {35, 0} , {34, 50} , {34, 0} ,
  {50, 0} , {50, 0} , {34, 50} , {51, 0} , {50, 0} , {0, 0} , {0, 0} , {34, 0} , {50, 0} , {66, 0} , {2, 0} , {0, 0} , {34, 0} , {0, 0} , {66, 0} , {0, 0} , {18, 0} , {34, 0} , {0, 0} , {66, 0} ,
  {34, 0} , {34, 0} , {35, 0} , {34, 0} , {34, 66} , {0, 0} , {0, 0} , {34, 0} , {50, 0} , {66, 0} , {66, 0} , {66, 0} , {34, 66} , {66, 0} , {67, 0} , {3, 0} , {2, 0} , {2, 0} , {2, 50} , {2, 0} ,
  {2, 0} , {18, 0} , {0, 0} , {50, 0} , {0, 0} , {2, 0} , {0, 0} , {34, 0} , {50, 0} , {0, 0} , {2, 50} , {50, 0} , {50, 0} , {51, 0} , {50, 0} , {2, 0} , {0, 0} , {0, 0} , {50, 0} , {66, 0} ,
  {2, 0} , {18, 0} , {0, 0} , {50, 0} , {0, 0} , {18, 0} , {19, 0} , {18, 0} , {18, 50} , {18, 0} , {0, 0} , {18, 0} , {34, 0} , {50, 0} , {0, 0} , {50, 0} , {18, 50} , {50, 0} , {51, 0} , {50, 0} ,
  {0, 0} , {18, 0} , {0, 0} , {50, 0} , {66, 0} , {2, 0} , {0, 0} , {34, 0} , {50, 0} , {0, 0} , {0, 0} , {18, 0} , {34, 0} , {50, 0} , {0, 0} , {34, 0} , {34, 0} , {35, 0} , {34, 50} , {34, 0} ,
  {50, 0} , {50, 0} , {34, 50} , {51, 0} , {50, 0} , {0, 0} , {0, 0} , {34, 0} , {50, 0} , {66, 0} , {2, 50} , {50, 0} , {50, 0} , {51, 0} , {50, 0} , {50, 0} , {18, 50} , {50, 0} , {51, 0} , {50, 0} ,
  {50, 0} , {50, 0} , {34, 50} , {51, 0} , {50, 0} , {51, 0} , {51, 0} , {51, 0} , {52, 0} , {51, 0} , {50, 0} , {50, 0} , {50, 0} , {51, 0} , {50, 66} , {2, 0} , {0, 0} , {0, 0} , {50, 0} , {66, 0} ,
  {0, 0} , {18, 0} , {0, 0} , {50, 0} , {66, 0} , {0, 0} , {0, 0} , {34, 0} , {50, 0} , {66, 0} , {50, 0} , {50, 0} , {50, 0} , {51, 0} , {50, 66} , {66, 0} , {66, 0} , {66, 0} , {50, 66} , {67, 0} ,
  {3, 0} , {2, 0} , {2, 0} , {2, 0} , {2, 66} , {2, 0} , {18, 0} , {0, 0} , {0, 0} , {66, 0} , {2, 0} , {0, 0} , {34, 0} , {0, 0} , {66, 0} , {2, 0} , {0, 0} , {0, 0} , {50, 0} , {66, 0} ,
  {2, 66} , {66, 0} , {66, 0} , {66, 0} , {67, 0} , {2, 0} , {18, 0} , {0, 0} , {0, 0} , {66, 0} , {18, 0} , {19, 0} , {18, 0} , {18, 0} , {18, 66} , {0, 0} , {18, 0} , {34, 0} , {0, 0} , {66, 0} ,
  {0, 0} , {18, 0} , {0, 0} , {50, 0} , {66, 0} , {66, 0} , {18, 66} , {66, 0} , {66, 0} , {67, 0} , {2, 0} , {0, 0} , {34, 0} , {0, 0} , {66, 0} , {0, 0} , {18, 0} , {34, 0} , {0, 0} , {66, 0} ,
  {34, 0} , {34, 0} , {35, 0} , {34, 0} , {34, 66} , {0, 0} , {0, 0} , {34, 0} , {50, 0} , {66, 0} , {66, 0} , {66, 0} , {34, 66} , {66, 0} , {67, 0} , {2, 0} , {0, 0} , {0, 0} , {50, 0} , {66, 0} ,
  {0, 0} , {18, 0} , {0, 0} , {50, 0} , {66, 0} , {0, 0} , {0, 0} , {34, 0} , {50, 0} , {66, 0} , {50, 0} , {50, 0} , {50, 0} , {51, 0} , {50, 66} , {66, 0} , {66, 0} , {66, 0} , {50, 66} , {67, 0} ,
  {2, 66} , {66, 0} , {66, 0} , {66, 0} , {67, 0} , {66, 0} , {18, 66} , {66, 0} , {66, 0} , {67, 0} , {66, 0} , {66, 0} , {34, 66} , {66, 0} , {67, 0} , {66, 0} , {66, 0} , {66, 0} , {50, 66} , {67, 0} ,
  {67, 0} , {67, 0} , {67, 0} , {67, 0} , {68, 0}
};

// test combination
int a ;
int b ;
int c ;
int d ;

void setup() {
  Serial.begin(115200);

  Serial.print("Type 4 numbers from 0 to 4:");
}

void loop() {
  if (Serial.available()) {
    String numbers = Serial.readString();
    a = numbers.charAt(0) - '0';
    b = numbers.charAt(1) - '0';
    c = numbers.charAt(2) - '0';
    d = numbers.charAt(3) - '0';

    if (a < 5 && b < 5 && c < 5 && d < 5) {
      getFrequency(a, b, c, d);
    } else {
      Serial.println("Out of range...");
    }
    Serial.print("\nType 4 numbers from 0 to 4:");
  }
}

void getFrequency(int a, int b, int c, int d) {
  // find frequency of a,b,c,d

  int index = 125 * a + 25 * b + 5 * c + d;

  Serial.print(F("\nCombination: "));
  Serial.print(index, 5); // use base-5. Leading zeros wil be missing...
  Serial.print(F(" -> "));

  // get values. We have 3 cases
  // 1. values[index][0] == 0 --> no winner
  // 2. values[index][1] == 0 --> one winner
  // 3. values[index][1] != 0 --> tie

  if (values[index][0] == 0) {
    Serial.println("\tNo winner");
  }
  else {
    byte val = values[index][0] >> 4;
    byte freq = values[index][0] & 0b1111;
    Serial.print(val); Serial.print(F(" repeats ")); Serial.print(freq); Serial.print(F(" times\t"));

    if (values[index][1] != 0) {
      val = values[index][1] >> 4;
      freq = values[index][1] & 0b1111;
      Serial.print(val); Serial.print(F(" repeats ")); Serial.print(freq); Serial.print(F(" times\t"));
    }
    Serial.println();
  }
}

1 Like

Replace 0, 1, 2, 3, 4 with 2, 3, 5, 7, 11. Multiply all values with each other. Check if the product is divisible with the squares of 2, 3, 5, 7, 11. What did I win?

1 Like

Elegant indeed! ... not sure how fast it would be. If I ever stumbled across that code it had better have some good comments :thinking:

You outlined the operation of the code, but not what it is used for, how it is applied. That is important when considering qualities of the input and output data.

But if you got what you needed, then fine.

I had a stab at implementing @Johan_Ha 's suggestion.

Not sure my attempt is the best way to do it, and obviously doesn't cover the edge cases OP wanted but it was fun to test out:

byte vals[] = { 2, 2, 1, 3 };
byte primes[] = { 2, 3, 5, 7, 11 }; 

byte mostCommon(byte vals[], byte size) {
  int total = 1;
  for (byte i = 0; i < size; i++)
    total *= primes[vals[i]];
  for (byte i = 0; i < size; i++) {
    if (!fmod(total, sq(primes[vals[i]])))
      return vals[i];
  }
  return -1; // Or print some error perhaps?
}

void setup() {
  Serial.begin(115200);
  Serial.println(mostCommon(vals, 4));
}

void loop() { }

One thing is creating an elegant algorithm to give the answers you want. Another thing is to create a user interface for the function.

The latter task could consist of the four global variables, a, b, c and d. And a function that saves all the crucial info in three variables that you include in the function call.

int most, most2, n;
... // set a, b, c and d
myFunction(&most, &most2, &n);

most would contain the value that repeats most, n would tell how many times it occured and most2 would contain another value that occured equaly many times, or -1 if there was none.
That would be the "interface". What happens inside myFunction() would be the former task.

@anon46966594 I personally implemented @Johan_Ha's solution like this, no need for 2 loops:

void setup() {
  Serial.begin(9600);

  uint8_t check[5] = {2*2, 3*3, 5*5, 7*7, 11*11};   ////x*x is faster than pow(x,2), requires less flash

  //let's consider this our output, the elements of this array can be assigned to anything later in code
  uint8_t labels[4] = {2, 3, 5, 5};            
  uint8_t product = labels[0] * labels[1] * labels[2] *labels[3];
  
  for (unint8_t i = 0; i < 5; i++) {
    if ((product % check[i]) == 0) {
      //this will output the label as being 0,1,2,3,4 (which is the original, and better naming)
      Serial.println(String(i) + " is the chosen label");
    }
  }

}

void loop() {
}

Time to execute: 23.15 ms
Flash usage: 45028 bytes
Ram: 9752 bytes

By default it outputs 1 label if it repeated 3 or 4 times.
It outputs 2 labels if they repeated 2 times each.
Would still need another variable + if, to check if no variable has repeated 2 times, so just print "All repeated once". Which can be done with a simple flag.

I would call @Johan_Ha's the most "elegant".
@red_car's second solution would be the most optimised at the expense of code footprint. The solution is similar to @Coding_Badly, only a few bytes less flash used.
Flash: 44504 bytes
Ram: 9752 bytes
Time to execute: 7.22ms

@red_car's original solution with the for loop, has a computation time of 17.75ms.
Flash usage: 44856 bytes
Ram usage: 9808 bytes

Edit: @Johan_Ha solution with a flag if everything repeated equally can look like this:

  uint8_t check[5] = {2*2, 3*3, 5*5, 7*7, 11*11};   ////x*x is faster than pow(x,2), also requires less flash

  
  uint8_t labels[4] = {2, 3, 5, 7};
  uint8_t product = labels[0] * labels[1] * labels[2] *labels[3];

  byte repeat = 0;
  for (uint8_t i = 0; i < 5; i++) {
    if ((product % check[i]) == 0) {
      Serial.println(String(i) + " is the chosen label");
      repeat++;
    }
  }
  if (repeat == 0) {Serial.println("All labels repeated once");};

Flash: 45116 bytes
Ram: 9752 bytes
Execution time: 23.15ms when labels repeat, 26ms with all labels are equal in repetition.

With more granularity in the output, it would look like this:

  uint8_t check[5] = {2 * 2, 3 * 3, 5 * 5, 7 * 7, 11 * 11}; ////x*x is faster than pow(x,2), also requires less flash
  uint8_t labels[4] = {2, 3, 3, 2};
  
  byte repeat = 0;
  uint8_t product = labels[0] * labels[1] * labels[2] * labels[3];
  for (uint8_t i = 0; i < 5; i++) {
    if ((product % check[i]) == 0) {
      if (repeat == 0) Serial.println(String(i) + " is the chosen label");
      else Serial.println("There is also the chance of: " + String(i));
      repeat++;
    }
  }
  if (repeat == 0) Serial.println("All labels repeated once. XGBoost has been selected!");
1 Like

Big-O disagrees. :wink:

Best of five?

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.