I am working on a project where I want to stream video and perform object detection directly on the ESP32-CAM without relying on an external PC. I understand that the ESP32-CAM has limited hardware resources, so I'm looking for the best way to achieve this.
Here are my questions:
1. Are there any lightweight object detection models (e.g., TinyML models) that can run directly on the ESP32-CAM?
2. What tools or frameworks should I use to deploy and run object detection models on the ESP32-CAM? I've read about TensorFlow Lite for Microcontrollers, but I'm unsure if it supports this setup.
3. Is it possible to run pre-trained models like TinyYOLO on the ESP32-CAM, or do I need to train a custom model optimized for microcontrollers?
4. Are there any examples, libraries, or guides to help me set up this pipeline?
I’m particularly interested in detecting simple objects like “person” or “car.” Any advice on achieving this directly on the ESP32-CAM would be greatly appreciated.
The ESP32-CAM has limited memory: 520 KB of internal SRAM and, on some boards, an additional 4 MB of external PSRAM. It can hold a small uncompressed image such as 320x240 (QVGA), which needs roughly 150 KB in RGB565 or 230 KB in 24-bit RGB888, but larger resolutions (e.g., 640x480) exceed the SRAM capacity unless PSRAM is enabled.
For analysis, most ESP32-CAM setups therefore work with compressed JPEG frames to save memory, decompressing them only when processing actually requires raw pixels.
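As a hedged illustration of that pattern, here is a configuration fragment in the style of Espressif's esp32-camera driver (the board-specific pin assignments are omitted and must come from your board's pinout; the quality and buffer-count values are assumptions, not recommendations):

```cpp
// Fragment only: assumes the esp32-camera driver (esp_camera.h) and that the
// board's pin mapping has already been filled into `config`.
camera_config_t config = {};
config.pixel_format = PIXFORMAT_JPEG;        // sensor delivers compressed frames
config.frame_size   = FRAMESIZE_QVGA;        // 320x240 keeps JPEG buffers small
config.jpeg_quality = 12;                    // 0-63; lower number = higher quality
config.fb_count     = psramFound() ? 2 : 1;  // double-buffer only if PSRAM exists

// ... after a successful esp_camera_init(&config):
camera_fb_t *fb = esp_camera_fb_get();   // fb->buf / fb->len hold the JPEG data
// process or transmit fb->buf here
esp_camera_fb_return(fb);                // hand the buffer back to the driver
```

Because the sensor itself emits JPEG, the full raw image never has to fit in RAM at once.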
I can add a microSD card to store pre-trained models or lightweight neural-network files. Object detection still demands computational power and RAM, though, so the ESP32-CAM can only handle very lightweight models.
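If models live on the microSD card, it's worth sanity-checking that a file really is a TensorFlow Lite flatbuffer before handing it to the interpreter: a .tflite file carries the FlatBuffers file identifier "TFL3" at bytes 4-7. A minimal, host-testable check (the function name is my own):

```cpp
#include <cstdint>
#include <cstring>
#include <cstddef>

// Returns true if `buf` begins with a TensorFlow Lite flatbuffer header.
// FlatBuffers stores a 4-byte file identifier at offset 4; for .tflite
// files that identifier is "TFL3".
bool looksLikeTflite(const uint8_t *buf, size_t len) {
    return len >= 8 && std::memcmp(buf + 4, "TFL3", 4) == 0;
}
```

On the device you would run this over the first bytes read from the SD card before calling `tflite::GetModel()`, failing fast on a corrupt or wrong file instead of crashing inside the interpreter.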
I just thought of using a cloud service for object detection with the ESP32-CAM: offloading inference to a cloud platform leverages computational power the ESP32-CAM itself lacks. Does anyone know how to set this up?
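One common cloud pattern is to POST each JPEG frame to an HTTP endpoint and parse the detection results from the JSON response. The endpoint and field name would depend on whichever service you pick, but the multipart/form-data framing itself is standard and can be built and tested on a host. A sketch (helper name and boundary string are my own):

```cpp
#include <string>
#include <cstdint>
#include <cstddef>

// Wrap a JPEG buffer in a multipart/form-data body suitable for an HTTP POST.
// The boundary and field name are arbitrary; the target server defines what
// it actually expects.
std::string makeMultipartBody(const uint8_t *jpeg, size_t len,
                              const std::string &boundary,
                              const std::string &fieldName) {
    std::string body;
    body += "--" + boundary + "\r\n";
    body += "Content-Disposition: form-data; name=\"" + fieldName +
            "\"; filename=\"frame.jpg\"\r\n";
    body += "Content-Type: image/jpeg\r\n\r\n";
    body.append(reinterpret_cast<const char *>(jpeg), len);
    body += "\r\n--" + boundary + "--\r\n";
    return body;
}
```

On the ESP32 the resulting string would be sent with an HTTP client (e.g., Espressif's esp_http_client) using the header `Content-Type: multipart/form-data; boundary=<your boundary>`, with the frame buffer from the camera driver as the payload.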