How to Perform Object Detection Directly on ESP32-CAM?

usernamear1s · December 12, 2024, 4:49pm

Hi everyone,

I am working on a project where I want to stream video and perform object detection directly on the ESP32-CAM without relying on an external PC. I understand that the ESP32-CAM has limited hardware resources, so I'm looking for the best way to achieve this.

Here are my questions:

1. Are there any lightweight object detection models (e.g., TinyML models) that can run directly on the ESP32-CAM?
2. What tools or frameworks should I use to deploy and run object detection models on the ESP32-CAM? I've read about TensorFlow Lite for Microcontrollers, but I'm unsure if it supports this setup.
3. Is it possible to run pre-trained models like TinyYOLO on the ESP32-CAM, or do I need to train a custom model optimized for microcontrollers?
4. Are there any examples, libraries, or guides to help me set up this pipeline?

I’m particularly interested in detecting simple objects like “person” or “car.” Any advice on achieving this directly on the ESP32-CAM would be greatly appreciated.

Thanks in advance for your help!

Paul_KD7HB · December 12, 2024, 6:25pm

Based on your research, does your ESP32-CAM have enough memory to store one uncompressed image so it can be analyzed?

srnet · December 12, 2024, 7:06pm

There are ESP32-S3 dev boards that have a camera connector, an SD card holder, and 8 MB of PSRAM.

usernamear1s · December 12, 2024, 9:42pm

The ESP32-CAM has limited memory, with 520 KB of SRAM and, in some models, an additional 4 MB of external PSRAM. It can store small uncompressed images like 320x240 (QVGA), which require around 230 KB of memory at 3 bytes per pixel, but larger resolutions (e.g., 640x480) will exceed the SRAM capacity unless PSRAM is enabled.
For analysis, most ESP32-CAM setups rely on compressed JPEG images to save memory, decompressing them only if necessary for processing.
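
The memory figures above can be checked with simple arithmetic. The following is a minimal sketch, not device code: the helper names `frameBytes` and `fitsIn` are hypothetical, and the budgets are the nominal 520 KB and 4 MB totals quoted in this thread. The free heap a sketch actually gets is smaller than the 520 KB total, so treat a "fits" result as an upper bound.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per pixel for three common uncompressed formats.
enum class PixelFormat : std::size_t { Grayscale = 1, RGB565 = 2, RGB888 = 3 };

// Nominal totals quoted in the thread (not the free heap at runtime).
const std::size_t kSramBytes  = 520 * 1024;
const std::size_t kPsramBytes = 4 * 1024 * 1024;

// Size in bytes of one uncompressed frame (hypothetical helper).
inline std::size_t frameBytes(std::size_t width, std::size_t height, PixelFormat fmt) {
    assert(width > 0 && height > 0);
    return width * height * static_cast<std::size_t>(fmt);
}

// True if a frame of the given size fits in a memory budget (hypothetical helper).
inline bool fitsIn(std::size_t frame, std::size_t budget) {
    return frame <= budget;
}
```

For example, `frameBytes(320, 240, PixelFormat::RGB888)` gives 230,400 bytes, which matches the "around 230 KB" figure, while `frameBytes(640, 480, PixelFormat::RGB888)` gives 921,600 bytes, more than the whole 520 KB of SRAM.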

Paul_KD7HB · December 12, 2024, 9:43pm

Right, and your necessary processing is object detection. So your ESP will have to decompress the image to examine each pixel.
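
Once a frame is decompressed, a small model usually needs it shrunk and simplified before it can examine the pixels. The following is a rough sketch of that preprocessing step under several assumptions: the frame is already available as 16-bit RGB565 values with red in the top bits (on real hardware the byte order may need swapping), the model wants 8-bit grayscale, and the function names are hypothetical. A 96x96 target is a common choice for tiny person-detection models, but the right size depends on the model you deploy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert one RGB565 pixel (RRRRRGGG GGGBBBBB) to 8-bit luminance.
inline std::uint8_t rgb565ToGray(std::uint16_t p) {
    // Expand each channel to the 0..255 range.
    const std::uint32_t r = ((p >> 11) & 0x1F) * 255 / 31;
    const std::uint32_t g = ((p >> 5) & 0x3F) * 255 / 63;
    const std::uint32_t b = (p & 0x1F) * 255 / 31;
    // Integer approximation of 0.299 R + 0.587 G + 0.114 B (weights sum to 256).
    return static_cast<std::uint8_t>((77 * r + 150 * g + 29 * b) >> 8);
}

// Nearest-neighbour downscale of an RGB565 frame to a grayscale image.
// Aspect ratio is not preserved if the source and target shapes differ.
inline std::vector<std::uint8_t> downscaleToGray(const std::vector<std::uint16_t>& src,
                                                 std::size_t srcW, std::size_t srcH,
                                                 std::size_t dstW, std::size_t dstH) {
    assert(src.size() == srcW * srcH);
    assert(dstW > 0 && dstH > 0);
    std::vector<std::uint8_t> dst(dstW * dstH);
    for (std::size_t y = 0; y < dstH; ++y) {
        const std::size_t sy = y * srcH / dstH;  // source row for this output row
        for (std::size_t x = 0; x < dstW; ++x) {
            const std::size_t sx = x * srcW / dstW;  // source column for this output column
            dst[y * dstW + x] = rgb565ToGray(src[sy * srcW + sx]);
        }
    }
    return dst;
}
```

A 96x96 grayscale result occupies 9,216 bytes, far less than the source frame, which is why this step usually comes right after capture.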

usernamear1s · December 12, 2024, 9:58pm

I can add a microSD card to store pre-trained models or lightweight neural network files. However, object detection requires computational power and RAM, and the ESP32-CAM can only handle very lightweight models.

usernamear1s · December 12, 2024, 10:04pm

I just thought of using a cloud service for object detection. The ESP32-CAM would send its images to a cloud platform, which has the computational power for tasks the board cannot handle because of its hardware limitations. Does anyone know how to do this?
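
One common shape for the cloud approach is to upload each JPEG frame over HTTP and read back the detection result. The following is a minimal sketch of the request framing only, under stated assumptions: the server is a hypothetical endpoint that accepts a raw JPEG body, and `buildJpegPostHeader` is an illustrative name. Real cloud vision services usually require authentication and their own request format, so check the documentation of whichever service you choose.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build the HTTP/1.1 header for uploading one JPEG frame (hypothetical helper).
// The JPEG bytes themselves are sent immediately after this header.
inline std::string buildJpegPostHeader(const std::string& host,
                                       const std::string& path,
                                       std::size_t jpegLen) {
    assert(!host.empty());
    assert(!path.empty() && path[0] == '/');
    std::string h;
    h += "POST " + path + " HTTP/1.1\r\n";
    h += "Host: " + host + "\r\n";
    h += "Content-Type: image/jpeg\r\n";
    h += "Content-Length: " + std::to_string(jpegLen) + "\r\n";
    h += "Connection: close\r\n";
    h += "\r\n";  // blank line ends the header section
    return h;
}
```

On the board, the header and then the compressed frame would be written to a TCP or TLS client connection. The camera already produces JPEG, so nothing has to be decompressed on the ESP32-CAM in this setup.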

gfvalvo · December 12, 2024, 10:57pm

You might get some ideas from studying the example ESP32-CAM code that does face detection.

jremington · December 12, 2024, 11:26pm

The first step would be to check if you can actually stream video on the ESP32-CAM.

If you haven't done that yet, the results might be disappointing. Depending, of course, on how you define "stream video".

Brazilino · December 12, 2024, 11:31pm

Take a look at https://dronebotworkshop.com/esp32-object-detect/

system · June 10, 2025, 11:31pm

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.
