BLE Scan crashes in dense environments when using u8g2

Hello Guys,

I am having an issue with my Nano 33 BLE and hope you can give me a hand solving it.
The idea is something like a super simple "keyless access" system, like cars have.
The Arduino scans for devices, and if it finds a BLE tag you get a green light and the system is unlocked.

For that I am doing a scan every 3 seconds.
If I find a key, the corresponding flag is set.

If the key was not found during the last 3 scans (no one is perfect), the system goes to the locked state.
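
The locking rule described above can be sketched in isolation, without any BLE calls. This is only a minimal illustration: the name KeyTracker and its methods are hypothetical and do not come from the sketch posted below.

```cpp
// Tracks lock state across scan windows: a found key unlocks at once,
// and the system locks again after `limit` consecutive windows with no key.
// KeyTracker is a hypothetical helper, not part of ArduinoBLE.
struct KeyTracker {
  int  limit;             // misses allowed before locking
  int  misses   = 0;      // consecutive scan windows without a key
  bool unlocked = false;
  bool keyFound = false;  // key seen during the current window

  explicit KeyTracker(int missLimit) : limit(missLimit) {}

  // Call when the scan reports the expected tag.
  void onKeySeen() {
    keyFound = true;
    unlocked = true;
    misses = 0;
  }

  // Call when a scan window expires; resets the per-window flag.
  void onWindowEnd() {
    if (!keyFound && misses < limit) {
      misses++;
      if (misses >= limit) unlocked = false;
    }
    keyFound = false;
  }
};
```

With a limit of 3, one sighting keeps the system unlocked through two empty scan windows and locks it on the third.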

Everything works perfectly until I go to big streets with lots of cars and people around.
The Arduino then crashes and freezes (I can see it on the OLED display, so it's not just a missing key).

Here is the source of the scanning code:

#include <ArduinoBLE.h>

String keyUUID = "00ff";
String keyAddress[2];
String keyName = "Tile";
double rescanTime = 3.0; // Seconds
int rescanTries = 0;
int rescanLimit = 3;   
bool keyFound = false;    // Key found during current scan session
unsigned long lastScan = 0;
bool scanExpired = false;

bool scanActive = false;
bool rescanRequired = false;
double rescanTimeout = 1;  // Sec, time between scans
unsigned long rescanStopped = 0;

void securitySetup() {
   // begin initialization
  keyAddress[0] = "59:9e:c7:22:45:38";
  keyAddress[1] = "4c:1f:20:37:b5:0d";
  // Lock status LED
  pinMode(13, OUTPUT);
  digitalWrite(13, LOW);
   
  Serial.begin(9600);
  //while (!Serial);
 
  if (!BLE.begin()) {
    Serial.println("starting BLE failed!");

    while (1);
  }

  Serial.println("BLE Central scan");
  
  rescan();
}

void rescanRequest(){
  BLE.stopScan();

  scanActive = false;
  rescanStopped = millis();
  lastScan = millis();
}

void rescan(){
  //Serial.println("Rescan");
  keyFound = false;
  scanExpired = false;
  BLE.scanForUuid(keyUUID);
  //BLE.scanForAddress(keyAddress[0]);
  scanActive = true;
  lastScan = millis();
}

void rescanManager(){
  if(!scanActive){
    if(millis() - rescanStopped >= (rescanTimeout * 1000)){
      rescan();
    }
  }
}

void securityLoop() {
    // put your main code here, to run repeatedly:
    BLEDevice peripheral = BLE.available();
 
    if (peripheral) { // Devices discovered
      if (peripheral.hasAdvertisedServiceUuid()) {
        for (int i = 0; i < peripheral.advertisedServiceUuidCount(); i++) {
          if(peripheral.advertisedServiceUuid(i) == keyUUID && peripheral.hasLocalName() && peripheral.localName() == keyName && (peripheral.address() == keyAddress[0] || peripheral.address() == keyAddress[1])){
            //Serial.print("Key found in ");
            //Serial.print(millis() - lastScan);
            //Serial.println("ms");
            if(!unlocked){
              Serial.println("Key found");
            }
            rescanTries = 0;
            unlocked = true;
            keyFound = true;
          }
        }
      }
    }
 
    scanExpired = (millis() - lastScan) > (1000 * rescanTime);
 
    if(scanExpired){  // Time for rescan to check if key is still there
      if(!keyFound){
        rescanTries += 1;
        Serial.print("No key, retry, att. ");
        Serial.println(rescanTries);
        if(rescanTries >= rescanLimit){
          unlocked = false;
          rescanTries = rescanLimit;
          Serial.println("No key, lock");
        }
      }
      rescanRequest();
    }

    rescanManager();
}

I suspect some kind of overflow caused by too many peripheral results.
But that is why I am using scanForUuid(): to filter the results and only read useful tags.
Or maybe some devices crash the scan procedure for some reason.
Has any of you seen something like that?
It would be super cool to solve this issue.

Thanks

Maybe your Arduino is punishing you for breaking the BLE rules and not using a 128-bit random UUID.

Your code still uses double for no reason.

Did you try to use an Arduino Nano 33 IoT as suggested in August 2020?

I've tried not using UUIDs at all, only local names with scanForName().
No change, still crashing while scanning.
I've also bought a Nano 33 IoT, but my sketch does not seem to work properly on it.
It is not finding the key; I am still figuring out why.
Could you please give me an example of the right way of using UUIDs?
How should I declare it?
I took my value ("00ff") from a Bluetooth scanning utility; it is the UUID of the battery level characteristic service. (The real one is different from what I've posted, I just don't want to share it since it is a security system.)

UPD:
From this example (ArduinoBLE - Arduino Reference) I can see that my UUID declaration matches the required format exactly: a string of four lowercase characters.
Because we're talking here about a service UUID and not a device UUID, right? :slight_smile:

What is the BLE-Tag? Is that another Arduino or do you use a commercial product? Do you have a datasheet for that or if it is your own can you post the code?

When you scan for the battery level characteristic you will get many results, because all mobile phones have a battery level service. That is why using a random 128-bit UUID should work. The chance of the same UUID existing a second time is realistically zero. That is why 128 bits were chosen by the Bluetooth SIG: they wanted everyone to be able to create a UUID without needing to ask them.

Here is an example of a 128-bit UUID in the format supported by the ArduinoBLE library.

"A19BE17B-E11E-E420-93F4-C7ED525FA275"

There are online generators or you can write a small script.
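
As a rough illustration of such a small script, here is a sketch meant to be compiled and run on a PC, not on the board. It assumes the common "version 4" (random) UUID layout, which is a choice made for this example; the function name makeRandomUuid is hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <random>
#include <string>

// Builds a random 128-bit UUID string in the 8-4-4-4-12 hex layout,
// uppercase. The version/variant bits follow the usual "version 4" scheme
// (an assumption of this example).
std::string makeRandomUuid() {
  std::random_device rd;
  std::uniform_int_distribution<int> byteDist(0, 255);

  uint8_t bytes[16];
  for (uint8_t &b : bytes) {
    b = static_cast<uint8_t>(byteDist(rd));
  }

  bytes[6] = (bytes[6] & 0x0F) | 0x40;  // version 4 (random)
  bytes[8] = (bytes[8] & 0x3F) | 0x80;  // RFC 4122 variant

  std::string out;
  char buf[3];
  for (int i = 0; i < 16; i++) {
    if (i == 4 || i == 6 || i == 8 || i == 10) out += '-';
    std::snprintf(buf, sizeof(buf), "%02X", bytes[i]);
    out += buf;
  }
  return out;
}
```

Printing the result of makeRandomUuid() once and pasting the string into the sketch is enough; the UUID does not need to be regenerated on the board.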

Because UUIDs are public it is not a good idea to use them as a security feature. If you let me know what the BLE tag is we can look for a better alternative.

The tag is Gigaset G-Tag
I also used another one (some cheap Chinese thing). It had a different service with a different UUID that was not even recognised as anything known (it looked really random, so it should not be that common in devices).
And it still crashed.
And I guess if there was something wrong with the UUID, it would more likely not work at all.
But in my case it does. And even cleaning the code of everything UUID-related (scanning for name or address, which is unique for sure) still results in the same thing.
Another thing I tried today is updating the board core from 1.1.3 to 1.3.0, but that made it even worse.
It started crashing even in places where it used to work properly. It is still possible to get it running after 3-5 reboots.
That makes me really frustrated.
Reverted back to 1.1.3 for now.

Concerning the security issues- I don't need it to be perfectly unbreakable or something.
It is not a mass product, and that fact alone should provide good enough protection.
I think a BLE tag is the only way to build an access system like keyless access on cars.
But everything that goes over Bluetooth can be read, spoofed and hacked.
It is possible to make a custom tag with key exchange or something like that, but that is too complicated for this particular task. I am OK with checking just a few values (name, address and UUID).

kaws:
The tag is Gigaset G-Tag

I had a look at the documentation. Not very useful. It is for end users who do not need to know how this thing works.

kaws:
And I guess if there was something wrong with the UUID, it would more likely not work at all.

Your error description could suggest that your Arduino finds too many devices when it scans and that causes the issue. So, scanning for a unique UUID could solve the issue. Of course, it could be something else.

kaws:
Concerning the security issues- I don't need it to be perfectly unbreakable or something.
...

I agree, having a unique solution is likely good enough.

kaws:
I've also bought a Nano 33 IoT, but my sketch does not seem to work properly on it.

That is strange. I found the Arduino Nano 33 IoT to be more reliable.

I just tried to compile the sketch you posted. It does not compile. The variable unlocked does not exist and setup() and loop() have been renamed by you and therefore are missing.

I tried to follow your code a bit but find it hard.

  • Why did you rename loop() and setup()?

  • Why did you choose the name rescan instead of scan?

  • Why is the function rescanRequest called that? It stops scanning. What is the request?

  • What is the rescanManager managing?

  • How did you choose the functions' purpose?

  • Why did you initialize keyAddress in setup and not in the declaration like all the other global variables?

  • Why do you set scanExpired in the rescan function? This value is never used, because it is set before the only if().

  • Why do you use double?

  • rescanLimit, rescanTimeout, rescanTime do not need to be variables. They are constant values. A #define would be enough.

  • The LED is initialized but never used. Your comment suggests it is used for lock status.
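
To illustrate the points about double, constants and initialization at the declaration, here is a minimal sketch. It uses constexpr values in milliseconds rather than #define (either would work), and the names below, including isKnownAddress, are hypothetical.

```cpp
#include <cstring>

// Timing constants in milliseconds, so no floating point is needed.
// A value such as 1.5 s is simply written as 1500.
constexpr unsigned long RESCAN_TIME_MS    = 3000;  // length of one scan window
constexpr unsigned long RESCAN_TIMEOUT_MS = 1000;  // pause between scans
constexpr int           RESCAN_LIMIT      = 3;     // misses before locking

// Known key addresses, initialized at the declaration.
const char *const KEY_ADDRESSES[] = {
  "59:9e:c7:22:45:38",
  "4c:1f:20:37:b5:0d",
};
constexpr int KEY_ADDRESS_COUNT =
    sizeof(KEY_ADDRESSES) / sizeof(KEY_ADDRESSES[0]);

// Returns true if the given address matches one of the known keys.
bool isKnownAddress(const char *address) {
  for (int i = 0; i < KEY_ADDRESS_COUNT; i++) {
    if (std::strcmp(address, KEY_ADDRESSES[i]) == 0) return true;
  }
  return false;
}
```

In the sketch, the address check could then read isKnownAddress(peripheral.address().c_str()), and the timing comparisons would use the millisecond constants directly instead of multiplying by 1000.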

Ok, sorry.
I uploaded the full sketch. What I posted was just the Bluetooth part.
The uploaded test version is UUID-free, the one I talked about earlier.
By the way, core 1.3.0 crashes at this line: detachInterrupt(digitalPinToInterrupt(speedPin)); in the function getSpeed().
But even getting rid of it did not improve the Bluetooth behavior; as I said, it's even worse.
So:

  1. securitySetup() and securityLoop() are called from the normal setup() and loop().
  2. I don't think it matters that much, but it mostly runs after a previous scan (except for the first time), so to me it is logically more of a rescan.
  3. rescanRequest() records the timestamp at which the scan was stopped. I thought it was a good idea to give the board some time to chill in between scans rather than firing them one after another.
  4. rescanManager() is managing the rescan :smiley:
    It checks the timestamp of when the scan was stopped, and if enough time has passed (1 sec right now) it does a rescan.
  5. Described above :slight_smile:
  6. I forgot how to declare a string array this way. On the weekend I'll get some time to optimize my code with your suggestions.
  7. You're right, it is set in securityLoop() from the timestamp delta; I don't need to reset it there.
  8-9. To set those times in seconds and still be able to make it something like 1.5. I agree with you that defines will do the job better.
  10. Yes, in the full code it represents the status of the unlock pin and the unlock status.
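
The scan/pause cycle that rescanRequest() and rescanManager() implement together could also be written as one small state machine. This is only a sketch under that reading of the code; ScanScheduler and its members are hypothetical names.

```cpp
// Alternates between a scan window and a pause, driven by millis()-style
// timestamps. ScanScheduler is a hypothetical helper for illustration.
struct ScanScheduler {
  enum Action { NONE, START_SCAN, STOP_SCAN };

  unsigned long windowMs;   // how long one scan runs
  unsigned long pauseMs;    // rest time between scans
  bool scanning = false;
  bool started  = false;    // false until the very first scan begins
  unsigned long stamp = 0;  // time of the last state change

  ScanScheduler(unsigned long window, unsigned long pause)
      : windowMs(window), pauseMs(pause) {}

  // Call once per loop() with the current time; returns what to do now.
  Action update(unsigned long nowMs) {
    if (scanning) {
      if (nowMs - stamp >= windowMs) {
        scanning = false;
        stamp = nowMs;
        return STOP_SCAN;
      }
      return NONE;
    }
    if (!started || nowMs - stamp >= pauseMs) {
      started = true;
      scanning = true;
      stamp = nowMs;
      return START_SCAN;
    }
    return NONE;
  }
};
```

In loop(), START_SCAN would map to BLE.scanForUuid(...) and STOP_SCAN to BLE.stopScan(), with update(millis()) called on every pass.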

My board has an OLED display attached, but the sketch should also work fine on your side without it.
There is also a warning during compilation about JC_Button being incompatible with the current architecture, but it seems to run perfectly fine, since it's a simple button library and the Nano 33 BLE has internal pull-ups for inputs.

dashboard_noUUID.zip (6.6 KB)

I am back after 3 months :slight_smile:
I've made some progress investigating what is going on here, by sitting in the park with a laptop and some weird board attached, looking like a cheap copy of Mr. Robot.
The problem is now located, but still not solved.
The whole mess is caused by the OLED and BLE libraries working together.
Here is a simple sketch that also misbehaves.
It is basically the BLE scan example sketch with the u8g2 library added.
It's not even necessary to use the display: the problem shows up after .begin() (an OLED display is not even required; it does not matter whether one is connected or not).

/*
  Scan

  This example scans for BLE peripherals and prints out their advertising details:
  address, local name, advertised service UUIDs.

  The circuit:
  - Arduino MKR WiFi 1010, Arduino Uno WiFi Rev2 board, Arduino Nano 33 IoT,
    Arduino Nano 33 BLE, or Arduino Nano 33 BLE Sense board.

  This example code is in the public domain.
*/

#include <ArduinoBLE.h>
#include <U8g2lib.h>
#include <U8x8lib.h>

U8G2_SH1106_128X64_NONAME_F_HW_I2C u8g2(U8G2_R0, 0, A4, A5);

void setup() {
  Serial.begin(9600);
  while (!Serial);

  // begin initialization
  if (!BLE.begin()) {
    Serial.println("starting BLE failed!");

    while (1);
  }

  Serial.println("BLE Central scan");

  // start scanning for peripheral
  //BLE.scan();
  BLE.scanForName("Gigaset G-tag");    // Should filter results to use less RAM, but the crash is still there

  u8g2.begin();
}

void loop() {
  // check if a peripheral has been discovered
  BLEDevice peripheral = BLE.available();

  if (peripheral) {
    // discovered a peripheral
    Serial.println("Discovered a peripheral");
    Serial.println("-----------------------");

    // print address
    Serial.print("Address: ");
    Serial.println(peripheral.address());

    // print the local name, if present
    if (peripheral.hasLocalName()) {
      Serial.print("Local Name: ");
      Serial.println(peripheral.localName());
    }

    // print the advertised service UUIDs, if present
    if (peripheral.hasAdvertisedServiceUuid()) {
      Serial.print("Service UUIDs: ");
      for (int i = 0; i < peripheral.advertisedServiceUuidCount(); i++) {
        Serial.print(peripheral.advertisedServiceUuid(i));
        Serial.print(" ");
      }
      Serial.println();
    }

    // print the RSSI
    Serial.print("RSSI: ");
    Serial.println(peripheral.rssi());

    Serial.println();
  }
}

I am using scanForName() to minimise the number of peripherals discovered, and there is certainly no bunch of devices with this name nearby, but it still crashes (at the very start in most cases, but it can also go down after a while).
It looks to me like a memory issue (either RAM or a stack overflow), but the Nano 33 BLE is not an AVR, it has a lot of memory...
Can you give me a hand with where to go from here?
Since no additional hardware is required to reproduce the problem, you should be able to see what I am seeing.
The BLE library, u8g2 and the Mbed core are all up to date.
Thanks

Another update:
I used some Mbed memory tools to see what is happening inside

void memStat(){
  mbed_stats_heap_t heap_stats;
  mbed_stats_heap_get(&heap_stats);

  Serial.print("Heap size: ");
  Serial.print(heap_stats.current_size);
  Serial.print("/");
  Serial.println(heap_stats.reserved_size);

  int cnt = osThreadGetCount();
  mbed_stats_stack_t *stats = (mbed_stats_stack_t*) malloc(cnt * sizeof(mbed_stats_stack_t));
 
    if (stats) {
        cnt = mbed_stats_stack_get_each(stats, cnt);
        for (int i = 0; i < cnt; i++) {
            Serial.print("Stack size thr");
            Serial.print(stats[i].thread_id);
            Serial.print(": ");
            Serial.print(stats[i].max_size);
            Serial.print("/");
            Serial.println(stats[i].reserved_size);
        }
        free(stats);
    }
}

Here is the result (the last info the board sent before a crash):

19:21:09.026 -> Heap size: 19248/186424
19:21:09.026 -> Stack size thr536941612: 624/32768
19:21:09.026 -> Stack size thr536941980: 424/512
19:21:09.026 -> Stack size thr536879632: 392/1024
19:21:09.064 -> Stack size thr536941912: 104/768
19:21:09.064 -> Stack size thr536883712: 168/1024

The second thread looks a bit dangerous, being much closer to its limit than the others.
But the trickiest thing is that removing any references to u8g2 (which makes the board perfectly stable) does not affect any of those thread stats. Thread two is still around 420/512.
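
To put a number on "close to its limit", the used/reserved pairs printed by memStat() can be turned into a percentage. A minimal sketch, with hypothetical helper names and an arbitrary threshold chosen by the caller:

```cpp
#include <cstdint>

// Percentage of a thread's reserved stack covered by the reported usage
// (the "max_size/reserved_size" pair printed by memStat()).
// Returns 0 for a zero-sized reservation to avoid dividing by zero.
uint32_t stackUsagePercent(uint32_t maxSize, uint32_t reservedSize) {
  if (reservedSize == 0) return 0;
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(maxSize) * 100) / reservedSize);
}

// True if the usage is at or above the given percentage threshold.
bool stackNearLimit(uint32_t maxSize, uint32_t reservedSize,
                    uint32_t thresholdPercent) {
  return stackUsagePercent(maxSize, reservedSize) >= thresholdPercent;
}
```

For the second thread in the log above (424/512) this gives 82 percent, while the other threads stay below 40 percent.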

And here is the crash dump I've read from Serial1 with another board:

00:58:57.361 -> 
00:58:57.361 -> ++ MbedOS Fault Handler ++
00:58:57.361 -> 
00:58:57.361 -> FaultType: HardFault
00:58:57.361 -> 
00:58:57.361 -> Context:
00:58:57.361 -> R0   : 20001670
00:58:57.361 -> R1   : 0000000A
00:58:57.361 -> R2   : 00000215
00:58:57.361 -> R3   : 73552F20
00:58:57.361 -> R4   : 20001450
00:58:57.361 -> R5   : 20001670
00:58:57.361 -> R6   : 00000000
00:58:57.361 -> R7   : 000578E5
00:58:57.361 -> R8   : 000578D2
00:58:57.361 -> R9   : 00000000
00:58:57.361 -> R10  : 00000000
00:58:57.361 -> R11  : 00000000
00:58:57.361 -> R12  : 0004AC41
00:58:57.361 -> SP   : 20010D50
00:58:57.361 -> LR   : 0001313D
00:58:57.361 -> PC   : 73552F20
00:58:57.361 -> xPSR : 800F0000
00:58:57.398 -> PSP  : 20010D30
00:58:57.398 -> MSP  : 2003FFC0
00:58:57.398 -> CPUID: 410FC241
00:58:57.398 -> HFSR : 40000000
00:58:57.398 -> MMFSR: 00000000
00:58:57.398 -> BFSR : 00000001
00:58:57.398 -> UFSR : 00000000
00:58:57.398 -> DFSR : 00000000
00:58:57.398 -> AFSR : 00000000
00:58:57.398 -> Mode : Thread
00:58:57.398 -> Priv : Privileged
00:58:57.398 -> Stack: PSP
00:58:57.398 -> 
00:58:57.398 -> -- MbedOS Fault Handler --
00:58:57.398 -> 
00:58:57.398 -> 
00:58:57.398 -> 
00:58:57.398 -> ++ MbedOS Error Info ++
00:58:57.398 -> Error Status: 0x80FF013D Code: 317 Module: 255
00:58:57.398 -> Error Message: Fault exception
00:58:57.398 -> Location: 0x73552F20
00:58:57.398 -> Error Value: 0x200087D4
00:58:57.398 -> Current Thread: main Id: 0x20010E04 Entry: 0x49833 StackSize: 0x8000 StackMem: 0x20008DE0 SP: 0x20010D50 
00:58:57.435 -> For more info, visit: https://mbed.com/s/error?error=0x80FF013D&tgt=ARDUINO_NANO33BLE
00:58:57.435 -> -- MbedOS Error Info --

It seems that neither the heap nor the stack is the cause of all that...
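
One more detail in the dump may be worth a look: the PC value 0x73552F20 lies outside both the flash and the RAM of the nRF52840, and its four bytes are all printable ASCII. The sketch below decodes such a word and tests the two fault status bits that are set in the dump (HFSR bit 30, FORCED, and BFSR bit 0, IBUSERR). The function names are hypothetical, and reading this as a code pointer overwritten by string data is only a guess, not a diagnosis.

```cpp
#include <cstdint>
#include <string>

// Interprets a 32-bit word as four little-endian bytes and returns them as
// text, replacing non-printable bytes with '.'.
std::string wordAsAscii(uint32_t word) {
  std::string out;
  for (int i = 0; i < 4; i++) {
    char c = static_cast<char>((word >> (8 * i)) & 0xFF);
    out += (c >= 0x20 && c <= 0x7E) ? c : '.';
  }
  return out;
}

// HFSR bit 30 (FORCED): a configurable fault escalated to a HardFault.
bool isForcedHardFault(uint32_t hfsr) { return (hfsr & (1u << 30)) != 0; }

// BFSR bit 0 (IBUSERR): bus error on an instruction fetch.
bool isInstructionBusError(uint32_t bfsr) { return (bfsr & 1u) != 0; }
```

For the PC above, wordAsAscii(0x73552F20) yields the four characters space, '/', 'U', 's', and both bit tests return true for the HFSR and BFSR values in the dump.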

Still frustrated and still looking for ways to deal with it...

Sorry I don't have a solution, but I read this thread with interest because I have a similar issue.

My project with the u8g2 library was working fine, then I added the BLE library, and just BLE.begin() is enough to make the whole thing fall over.

I am using BLE, just like you, to see if a device exists, I don't need any data exchange.

I will try a sketch with no u8g2, and see if I get BLE working.

Hi t6jay,

Yep, I also have had no luck solving it.
It seems like the Mbed crashes strongly depend on the loop() cycle duration.
I got the system more stable (still not perfect though) by using the Adafruit GFX library and cranking the I2C speed up to 800,000 Hz.
I was also curious whether it has something to do with I2C itself, but trying u8g2 with software I2C did not solve the issue and made things even worse, crashing the system almost immediately (that kind of confirms the issue with loop() time).

I really gave up and am now implementing an intermediate "display controller". It is an Arduino Pro Micro that receives the values to display over UART and does all the u8g2 work on its own.

My main problem is hardware: it's not that easy to replace the Nano 33 BLE with anything else, since it is already installed in the device, and the only thing I can easily do is update the software. So I turned what used to be I2C into an additional serial port, like this:

UART displaySerial(digitalPinToPinName(A4), digitalPinToPinName(A5), NC, NC);   // A4 = TX, A5 = RX

Those lines are connected to the Pro Micro's RX and TX.
Replacing the whole u8g2 library with simple serial string sending dramatically decreased the loop() time, and the board now seems to be stable. But more tests are yet to come; I'll post an update here once I have more info.
This setup is capable of around 20-25 FPS, which is good enough for me.

I wish you good luck, but maybe it is worth a shot to try another board (maybe the Nano 33 IoT, as Klaus_K suggested earlier).

UPD:
Out of curiosity I tested the loop cycle rate in each case under equal conditions:
U8g2 lib: 21-22 cycles per second
Adafruit GFX: 22-23 cycles per second
No libs: 7022-8628 cycles per second
It is roughly 300 to 400 times faster now.... Just... wooow
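
For anyone who wants to repeat this kind of measurement, a loop-rate counter can be sketched as below. It is not the code used for the numbers above; LoopRateCounter is a hypothetical helper that is fed the current millis() value.

```cpp
// Counts loop() iterations and reports the count once per second.
// LoopRateCounter is a hypothetical helper for illustration.
struct LoopRateCounter {
  unsigned long windowStartMs = 0;
  unsigned long count = 0;
  unsigned long lastRate = 0;  // iterations counted in the last full window

  // Call once per loop() with millis(); returns true when lastRate
  // has just been updated.
  bool tick(unsigned long nowMs) {
    count++;
    if (nowMs - windowStartMs >= 1000) {
      lastRate = count;
      count = 0;
      windowStartMs = nowMs;
      return true;
    }
    return false;
  }
};
```

In loop(), a call such as if (counter.tick(millis())) Serial.println(counter.lastRate); would print the rate once per second, though the printing itself adds a little to the loop time.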

But...
If you only call U8g2 .begin() with no display update or anything else, it does 15,500 cycles a second, which is our new record :slight_smile:
Butt... you've guessed it- it crashes
So it is really unclear what causes the Mbed failure.

Again, I'll do more in-field tests of my new setup and get back to you with the results.

I can now confirm it's stable.
I would call it a workaround rather than a solution, but still, it works...
So we have to choose between BLE and OLED.
Quite disappointing since the board is pretty powerful, and it would be nice to discover what is really going wrong, but I ran out of ideas.
Maybe some BLE event happens during an I2C transaction, maybe some buffer overflows while the board is busy with loop execution, or there are some memory allocation issues. Could be anything.
Anyway, it definitely does not look like our fault.
I'll keep an eye on this thread in case you share some of your findings, t6jay.

That's very disappointing to hear. If I understand you correctly, it's impossible to use both BLE and an OLED display at the same time on a Nano 33 BLE?

Surely this cannot go unnoticed and unresolved by Arduino themselves, or whoever writes the cores? Is there some way to report such a flaw (on GitHub)?

I already have my setup working with OLED and U8g2. I had hoped to use BLE simply to scan for the presence (or absence) of a single BLE peripheral. This peripheral is a battery charger in a campervan, and its presence tells me that the campervan is connected to the mains electric hook-up (EHU), in which case I can present a warning about the EHU to the driver. Now I am faced with adding a second Nano, or connecting and wiring a relay on the 240 V.

That's what I got; at least I can't get BLE and OLED to run together. Someone else might be able to do it...
I already created an issue on GitHub around two months ago, but I highly doubt anyone really cares about our particular case :frowning:
There are no replies, even though I attached as much info on the crash as I could.
