About ESP32-CAM image capture byte data

Hello, everyone
I am trying to capture an image with an ESP32-CAM and send the image data to another module over WiFi.
To test this, I used the code below on the ESP32-CAM, and on my computer I used a program called “Socket Test” to receive the UDP packets it sends.

If you look at the attached picture, the computer is receiving data from ESP32-CAM, but I don’t know if this is image data.

I set pixel_format in the ESP32-CAM config to GRAYSCALE, so if pixel data is coming over, I would expect to see hexadecimal values such as AA, 43, 65, B3, 9C, etc.
However, if you look at the attached picture, the data shows up as very strange characters. It looks somewhat like ASCII, but only part of it seems to be ASCII.

To summarize: is the data received through Socket Test correct image data?

For your information, I need raw GRAYSCALE pixel data.

Please help me. Thank you.

#include "esp_camera.h"
#include <WiFi.h>
#include <WiFiUdp.h>
//
// WARNING!!! PSRAM IC required for UXGA resolution and high JPEG quality
//            Ensure ESP32 Wrover Module or other board with PSRAM is selected
//            Partial images will be transmitted if image exceeds buffer size
//

// Select camera model
//#define CAMERA_MODEL_WROVER_KIT // Has PSRAM
//#define CAMERA_MODEL_ESP_EYE // Has PSRAM
//#define CAMERA_MODEL_M5STACK_PSRAM // Has PSRAM
//#define CAMERA_MODEL_M5STACK_V2_PSRAM // M5Camera version B Has PSRAM
//#define CAMERA_MODEL_M5STACK_WIDE // Has PSRAM
//#define CAMERA_MODEL_M5STACK_ESP32CAM // No PSRAM
#define CAMERA_MODEL_AI_THINKER // Has PSRAM
//#define CAMERA_MODEL_TTGO_T_JOURNAL // No PSRAM

#include "camera_pins.h"

const char* ssid = "Waitinglee";
const char* password = "27179083";
unsigned int localport = 2390;
void startCameraServer();


WiFiUDP Udp;

void setup() {
  Serial.begin(115200);
  Serial.setDebugOutput(true);
  Serial.println();

  camera_config_t config;
  config.ledc_channel = LEDC_CHANNEL_0;
  config.ledc_timer = LEDC_TIMER_0;
  config.pin_d0 = Y2_GPIO_NUM;
  config.pin_d1 = Y3_GPIO_NUM;
  config.pin_d2 = Y4_GPIO_NUM;
  config.pin_d3 = Y5_GPIO_NUM;
  config.pin_d4 = Y6_GPIO_NUM;
  config.pin_d5 = Y7_GPIO_NUM;
  config.pin_d6 = Y8_GPIO_NUM;
  config.pin_d7 = Y9_GPIO_NUM;
  config.pin_xclk = XCLK_GPIO_NUM;
  config.pin_pclk = PCLK_GPIO_NUM;
  config.pin_vsync = VSYNC_GPIO_NUM;
  config.pin_href = HREF_GPIO_NUM;
  config.pin_sscb_sda = SIOD_GPIO_NUM;
  config.pin_sscb_scl = SIOC_GPIO_NUM;
  config.pin_pwdn = PWDN_GPIO_NUM;
  config.pin_reset = RESET_GPIO_NUM;
  config.xclk_freq_hz = 20000000;
  config.pixel_format = PIXFORMAT_GRAYSCALE;
  
  // QVGA (320x240) in grayscale is one byte per pixel, so each frame is
  // 76800 bytes. jpeg_quality only matters when pixel_format is JPEG.
  config.frame_size = FRAMESIZE_QVGA;
  config.jpeg_quality = 10;
  config.fb_count = 1;


#if defined(CAMERA_MODEL_ESP_EYE)
  pinMode(13, INPUT_PULLUP);
  pinMode(14, INPUT_PULLUP);
#endif

  // camera init
  esp_err_t err = esp_camera_init(&config);
  if (err != ESP_OK) {
    Serial.printf("Camera init failed with error 0x%x", err);
    return;
  }

  sensor_t * s = esp_camera_sensor_get();
  // initial sensors are flipped vertically and colors are a bit saturated
  if (s->id.PID == OV3660_PID) {
    s->set_vflip(s, 1); // flip it back
    s->set_brightness(s, 1); // up the brightness just a bit
    s->set_saturation(s, -2); // lower the saturation
  }
  // drop down frame size for higher initial frame rate
  s->set_framesize(s, FRAMESIZE_QVGA);

#if defined(CAMERA_MODEL_M5STACK_WIDE) || defined(CAMERA_MODEL_M5STACK_ESP32CAM)
  s->set_vflip(s, 1);
  s->set_hmirror(s, 1);
#endif

  WiFi.begin(ssid, password);

  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }
  Serial.println("");
  Serial.println("WiFi connected");

  startCameraServer();

  Serial.print("Camera Ready! Use 'http://");
  Serial.print(WiFi.localIP());
  Serial.println("' to connect");
}

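void loop() {
  // Open the local UDP socket once instead of on every pass.
  static bool udpStarted = false;
  if (!udpStarted) {
    Udp.begin(localport);
    udpStarted = true;
  }

  camera_fb_t* fb = esp_camera_fb_get();
  if (fb == NULL) {  // capture can fail; do not dereference a null frame
    Serial.println("Camera capture failed");
    delay(1000);
    return;
  }

  // Note: a QVGA grayscale frame is 320 * 240 = 76800 bytes, more than fits
  // in a single UDP datagram, so the receiver will not get it as one packet.
  Serial.println("Sending UDP packet...");
  Udp.beginPacket("192.168.229.252", localport);
  Udp.write(fb->buf, fb->len);
  Serial.println(fb->len);
  if (!Udp.endPacket()) Serial.println("UDP send failed");
  esp_camera_fb_return(fb);
  delay(10000);  // Capture and send one frame every 10 seconds.
}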

Don’t dump what you receive as ASCII; you won’t be able to make any sense of it. It’s a binary stream.

You should look for 0xFFD8, which marks the start of a JPEG file, and 0xFFD9, which marks the end.
Grab those markers and everything in between, dump that into a binary .jpg file, and try to open it on your desktop.
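
A minimal sketch of that marker search in plain C++, for the case where the frame buffer really does hold JPEG data. The names JpegSpan and findJpeg are made up for this example. On a raw grayscale frame it would normally report nothing found, or match pixel values by coincidence.

```cpp
#include <cstddef>
#include <cstdint>

// Location of a JPEG image inside a larger byte buffer.
struct JpegSpan {
  bool found;
  size_t start;   // index of the 0xFF byte of the FF D8 marker
  size_t length;  // byte count from FF D8 through FF D9 inclusive
};

// Find the first FF D8 (start of image) and the last FF D9 (end of image).
JpegSpan findJpeg(const uint8_t* buf, size_t len) {
  JpegSpan span = {false, 0, 0};
  if (buf == nullptr || len < 4) return span;

  size_t start = 0;
  bool haveStart = false;
  for (size_t i = 0; i + 1 < len; ++i) {
    if (buf[i] == 0xFF && buf[i + 1] == 0xD8) {
      start = i;
      haveStart = true;
      break;
    }
  }
  if (!haveStart) return span;

  // Scan backwards so trailing bytes after the image are ignored.
  for (size_t j = len - 1; j > start + 1; --j) {
    if (buf[j - 1] == 0xFF && buf[j] == 0xD9) {
      span.found = true;
      span.start = start;
      span.length = j - start + 1;
      return span;
    }
  }
  return span;
}
```

The bytes from buf + start for length bytes are what would go into the .jpg file.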

It seems you are using this UDP method: size_t write(const uint8_t *buffer, size_t size);

Notice the input requirement of const uint8_t *buffer. It seems to me the conversion is wrong.

Try:

Udp.write( (unsigned char *)fb->buf, fb->len ); 

It’s what I do when I FTP an image from the ESP32-CAM.

It does not matter: fb->buf is already a byte pointer (uint8_t *),

and the signature is

size_t write(const uint8_t *buffer, size_t size)

If the binary stream you are talking about consists of 8-bit values such as 01010101, that is the data I want.
I need raw GRAYSCALE image data rather than JPEG-compressed data, that is, one 8-bit brightness value per pixel.

So, is what I am seeing the data I want?

UDP is a protocol with no guaranteed delivery: packets can be lost, duplicated, or arrive out of order. If you want guarantees, you should implement some protocol on top of UDP.
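
As a rough illustration of what "a protocol on top of UDP" could look like, here is a sketch in plain C++ that numbers each chunk of a frame so the receiver can put chunks back in order and notice missing ones. The 6-byte header layout, the 1024-byte payload size, and the names makeChunks and FrameAssembler are all assumptions made for this example; nothing here comes from the ESP32 libraries. It handles a single frame and does not request retransmission.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical packet layout: frame id, chunk index, chunk count
// (each 16-bit, big-endian), then up to kPayloadSize bytes of pixels.
const size_t kHeaderSize = 6;
const size_t kPayloadSize = 1024;  // assumed value, small enough for one packet

static void put16(std::vector<uint8_t>& p, size_t at, uint16_t v) {
  p[at] = static_cast<uint8_t>(v >> 8);
  p[at + 1] = static_cast<uint8_t>(v & 0xFF);
}

static uint16_t get16(const std::vector<uint8_t>& p, size_t at) {
  return static_cast<uint16_t>((p[at] << 8) | p[at + 1]);
}

// Sender side: split one frame into numbered packets.
std::vector<std::vector<uint8_t>> makeChunks(const uint8_t* frame, size_t len,
                                             uint16_t frameId) {
  std::vector<std::vector<uint8_t>> packets;
  const uint16_t total =
      static_cast<uint16_t>((len + kPayloadSize - 1) / kPayloadSize);
  for (uint16_t idx = 0; idx < total; ++idx) {
    const size_t offset = static_cast<size_t>(idx) * kPayloadSize;
    const size_t n = std::min(kPayloadSize, len - offset);
    std::vector<uint8_t> p(kHeaderSize + n);
    put16(p, 0, frameId);
    put16(p, 2, idx);
    put16(p, 4, total);
    std::memcpy(p.data() + kHeaderSize, frame + offset, n);
    packets.push_back(p);
  }
  return packets;
}

// Receiver side: copy each chunk to its place, whatever the arrival order.
class FrameAssembler {
 public:
  explicit FrameAssembler(size_t frameLen)
      : data_(frameLen),
        seen_((frameLen + kPayloadSize - 1) / kPayloadSize, false),
        received_(0) {}

  // Returns true once every chunk of the frame has arrived.
  bool accept(const std::vector<uint8_t>& p) {
    if (p.size() > kHeaderSize) {
      const size_t idx = get16(p, 2);
      const size_t n = p.size() - kHeaderSize;
      const size_t offset = idx * kPayloadSize;
      // Ignore out-of-range chunks and duplicates.
      if (idx < seen_.size() && offset + n <= data_.size() && !seen_[idx]) {
        std::memcpy(data_.data() + offset, p.data() + kHeaderSize, n);
        seen_[idx] = true;
        ++received_;
      }
    }
    return complete();
  }

  bool complete() const { return received_ == seen_.size(); }
  const std::vector<uint8_t>& data() const { return data_; }

 private:
  std::vector<uint8_t> data_;
  std::vector<bool> seen_;
  size_t received_;
};
```

With these numbers a 76800-byte QVGA grayscale frame becomes 75 packets. A real receiver would also compare the frame id, start over when a new frame begins, and decide what to do about chunks that never arrive.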

I am aware of that, but I have no choice other than UDP. What matters to me is receiving the UDP packets and forwarding the bytes from the Arduino to the FPGA over a serial link. To know whether that is feasible, I need to know whether the data shown in the attached photo was received as binary.

You are definitely sending binary data with

Udp.write(fb->buf, fb->len); 

There is no processing done in between so whatever is pointed at in buf will be sent.

The frame buffer holds the output of the camera, encoded in the way you set it up. There might be additional info in the frame buffer on top of the JPEG, hence my point about looking for 0xFFD8 and 0xFFD9 (that’s what I do for FTP transfer and it works fine; I do get a proper .jpg file on the other side).

I was a little late because of the exam. I’m sorry.

I think the output should come out in BMP format because the pixel format is set to GRAYSCALE, not JPEG. Therefore, I think it is meaningless to look for 0xFFD8 and 0xFFD9. Also, I opened sensor.h, which esp_camera.h includes, and found the part where the pixel formats are defined. According to the comment there, GRAYSCALE is 1BPP/GRAYSCALE.

Is the data still compressed to JPEG when it is sent in UDP packets, even with this setting?
And since I’m sending the data directly to another Arduino board over WiFi, do I need FTP?

Here is the part of sensor.h that I looked at.

#ifndef __SENSOR_H__
#define __SENSOR_H__
#include <stdint.h>
#include <stdbool.h>

#define NT99141_PID     (0x14)
#define OV9650_PID     (0x96)
#define OV7725_PID     (0x77)
#define OV2640_PID     (0x26)
#define OV3660_PID     (0x36)
#define OV5640_PID     (0x56)
#define OV7670_PID     (0x76)

typedef enum {
    PIXFORMAT_RGB565,    // 2BPP/RGB565
    PIXFORMAT_YUV422,    // 2BPP/YUV422
    PIXFORMAT_GRAYSCALE, // 1BPP/GRAYSCALE
    PIXFORMAT_JPEG,      // JPEG/COMPRESSED
    PIXFORMAT_RGB888,    // 3BPP/RGB888
    PIXFORMAT_RAW,       // RAW
    PIXFORMAT_RGB444,    // 3BP2P/RGB444
    PIXFORMAT_RGB555,    // 3BP2P/RGB555
} pixformat_t;

typedef enum {
    FRAMESIZE_96X96,    // 96x96
    FRAMESIZE_QQVGA,    // 160x120
    FRAMESIZE_QCIF,     // 176x144
    FRAMESIZE_HQVGA,    // 240x176
    FRAMESIZE_240X240,  // 240x240
    FRAMESIZE_QVGA,     // 320x240
    FRAMESIZE_CIF,      // 400x296
    FRAMESIZE_HVGA,     // 480x320
    FRAMESIZE_VGA,      // 640x480
    FRAMESIZE_SVGA,     // 800x600
    FRAMESIZE_XGA,      // 1024x768
    FRAMESIZE_HD,       // 1280x720
    FRAMESIZE_SXGA,     // 1280x1024
    FRAMESIZE_UXGA,     // 1600x1200
    // 3MP Sensors
    FRAMESIZE_FHD,      // 1920x1080
    FRAMESIZE_P_HD,     //  720x1280
    FRAMESIZE_P_3MP,    //  864x1536
    FRAMESIZE_QXGA,     // 2048x1536
    // 5MP Sensors
    FRAMESIZE_QHD,      // 2560x1440
    FRAMESIZE_WQXGA,    // 2560x1600
    FRAMESIZE_P_FHD,    // 1080x1920
    FRAMESIZE_QSXGA,    // 2560x1920
    FRAMESIZE_INVALID
} framesize_t;
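
For reference, the "1BPP" comment means one byte per pixel, so the expected size of a raw frame can be worked out from the frame size alone. A small sketch in plain C++ (the function names are made up here; 65507 bytes is the largest payload of a UDP datagram over IPv4):

```cpp
#include <cstddef>

// Expected size in bytes of an uncompressed frame.
constexpr size_t rawFrameBytes(size_t width, size_t height,
                               size_t bytesPerPixel) {
  return width * height * bytesPerPixel;
}

// Largest UDP payload over IPv4: 65535 - 20 (IP header) - 8 (UDP header).
constexpr size_t kMaxUdpPayload = 65507;

constexpr bool fitsInOneUdpDatagram(size_t bytes) {
  return bytes <= kMaxUdpPayload;
}

// QVGA (320x240) at 1 byte per pixel.
static_assert(rawFrameBytes(320, 240, 1) == 76800, "QVGA grayscale size");
static_assert(!fitsInOneUdpDatagram(rawFrameBytes(320, 240, 1)),
              "a QVGA grayscale frame needs more than one datagram");
```

If fb->len printed by the sketch equals 76800, that would be consistent with raw 8-bit pixels and no file header of any kind, BMP or otherwise. It also means the frame has to be spread over several UDP packets on the wire.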

Right, I had missed that.

Have you tried sending the frame buffer over serial to a serial monitor that accepts binary input, and saving it directly to a file, to double-check that you do get a BMP?

The plan is for an Arduino module to receive the image as UDP packets and then send the binary image data (BMP) to an FPGA (I don’t know if you are familiar with it) over a serial link. I am going to run an image enhancement algorithm on the FPGA, so I intend to keep receiving the image data as binary input.

I meant extracting the binary frame buffer to a file, just to make sure the format is what you expect.

Oh, I got it.
I had ESP32-CAM send the frame buffer to my laptop via wifi. After that, I checked the frame buffer data that has been sent as UDP packets through the Socket Test program on my laptop.
However, the format did not come out as expected. If you look at the attached photo file, you can see that strange non-ASCII characters are printed out.

So I just tried again with a program called PacketSender, and the data format I wanted came out. It’s a hexadecimal format.
(I have attached a photo.)

What I still don’t know is how to view the image data I received as a picture.

The frame buffer is binary data, not ASCII… if you just print the frame buffer, of course you won’t see ASCII characters… you need to write a dump program that outputs the bytes in decimal, hex, or binary if you want to see what’s going on.

Personally I can’t decode a BMP file mentally by looking at the bits and bytes… so that’s why I was suggesting to dump the bytes into a file, name that file test.bmp and double click on it on your PC / Mac to see if it opens fine…

That’s right, but when you unpack UDP packets you get char values, so I understand that what is printed on the screen is shown as ASCII.

As you said, I transferred the hexadecimal data to a notepad and saved it as .bmp, but the file did not open. Is this the right way to save it?

Below is how I saved it.

You get bytes, and you can do what you want with them… just cast each one to uint8_t and print it in HEX.
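
A minimal sketch of such a dump in plain C++. The name hexDump is made up for this example. It formats each byte as two hex digits instead of treating it as a character:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a byte buffer as two-digit hex values, perLine values per row.
std::string hexDump(const uint8_t* buf, size_t len, size_t perLine = 16) {
  if (perLine == 0) perLine = 16;  // avoid dividing by zero below
  std::string out;
  char cell[4];
  for (size_t i = 0; i < len; ++i) {
    std::snprintf(cell, sizeof(cell), "%02X", static_cast<unsigned>(buf[i]));
    out += cell;
    const bool endOfLine = ((i + 1) % perLine == 0) || (i + 1 == len);
    out += endOfLine ? '\n' : ' ';
  }
  return out;
}
```

For the five bytes AA, 43, 65, B3, 9C this returns the single line "AA 43 65 B3 9C". On the Arduino side, printing each byte with Serial.print(value, HEX) gives a similar result, without the leading zero for values below 0x10.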

Seems your unpacking is now ok but I can’t say if it’s a proper bmp

That’s right. Passing this data to the FPGA is something I still need to work on, but first I want to make sure that these bytes are a valid BMP.

As you said, I wrote down the bytes in the notepad (picture attached above), saved it, designated it as .bmp, and checked if the file was opened, but it didn’t open. Is there any way to check?

You need to save the binary into a file, not the ascii representation

As you said, I converted all the hexadecimal data to binary and saved it in Notepad (as shown in the attached picture), but it still did not open.
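
Two things are probably going wrong here. Typing "AA" or "10101010" into Notepad stores the characters of that text, not the single byte 0xAA. And a raw grayscale frame is only pixel values, so even the correct bytes would lack the header an image viewer expects in a .bmp file. The sketch below, in plain C++ for the PC side, turns a hex dump back into real bytes and wraps them in an 8-bit grayscale BMP. The names parseHex, grayToBmp, and writeBinaryFile are made up for this example, and it assumes the dump holds one complete frame of known width and height (for example 320x240), with rows in top-to-bottom order.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Turn hex text such as "AA 43 65" into the bytes it represents.
// Assumes plain two-digit pairs with no "0x" prefixes.
std::vector<uint8_t> parseHex(const std::string& text) {
  std::vector<uint8_t> bytes;
  int high = -1;  // first digit of the current pair, or -1 if none yet
  for (unsigned char ch : text) {
    if (!std::isxdigit(ch)) continue;  // skip spaces, commas, newlines
    int v = std::isdigit(ch) ? ch - '0' : std::tolower(ch) - 'a' + 10;
    if (high < 0) {
      high = v;
    } else {
      bytes.push_back(static_cast<uint8_t>(high * 16 + v));
      high = -1;
    }
  }
  return bytes;
}

// Append the low `count` bytes of `value`, least significant byte first.
static void putLE(std::vector<uint8_t>& out, uint32_t value, int count) {
  for (int i = 0; i < count; ++i)
    out.push_back(static_cast<uint8_t>(value >> (8 * i)));
}

// Wrap raw 8-bit grayscale pixels (top row first) in an 8-bit BMP.
// Returns an empty vector if there are too few pixels.
std::vector<uint8_t> grayToBmp(const std::vector<uint8_t>& pixels,
                               uint32_t width, uint32_t height) {
  std::vector<uint8_t> bmp;
  if (width == 0 || height == 0 ||
      pixels.size() < static_cast<size_t>(width) * height) return bmp;
  const uint32_t rowSize = (width + 3) & ~3u;     // rows are padded to 4 bytes
  const uint32_t dataOffset = 14 + 40 + 256 * 4;  // headers plus palette
  const uint32_t imageSize = rowSize * height;

  bmp.push_back('B'); bmp.push_back('M');         // file header (14 bytes)
  putLE(bmp, dataOffset + imageSize, 4);          // total file size
  putLE(bmp, 0, 4);                               // reserved
  putLE(bmp, dataOffset, 4);                      // where pixel data starts
  putLE(bmp, 40, 4);                              // BITMAPINFOHEADER size
  putLE(bmp, width, 4);
  putLE(bmp, height, 4);                          // positive: rows bottom-up
  putLE(bmp, 1, 2);                               // colour planes
  putLE(bmp, 8, 2);                               // bits per pixel
  putLE(bmp, 0, 4);                               // no compression
  putLE(bmp, imageSize, 4);
  putLE(bmp, 2835, 4); putLE(bmp, 2835, 4);       // about 72 DPI
  putLE(bmp, 256, 4);                             // palette entries
  putLE(bmp, 0, 4);                               // all colours important
  for (uint32_t i = 0; i < 256; ++i) {            // gray palette: B, G, R, 0
    const uint8_t g = static_cast<uint8_t>(i);
    bmp.insert(bmp.end(), {g, g, g, uint8_t{0}});
  }
  for (uint32_t y = height; y-- > 0;) {           // last image row goes first
    const uint8_t* row = pixels.data() + static_cast<size_t>(y) * width;
    bmp.insert(bmp.end(), row, row + width);
    bmp.insert(bmp.end(), static_cast<size_t>(rowSize - width), uint8_t{0});
  }
  return bmp;
}

// Write bytes to disk unchanged (binary mode, no text translation).
bool writeBinaryFile(const std::string& path,
                     const std::vector<uint8_t>& data) {
  std::ofstream out(path, std::ios::binary);
  out.write(reinterpret_cast<const char*>(data.data()),
            static_cast<std::streamsize>(data.size()));
  return out.good();
}
```

A typical use would be to read the saved hex dump into a string, then call writeBinaryFile("test.bmp", grayToBmp(parseHex(text), 320, 240)) and open test.bmp. If grayToBmp returns an empty vector, fewer than 320 * 240 = 76800 bytes were captured, which would suggest that some UDP packets of the frame are missing.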