Playing microphone analog input into i2s

Hello,

I'm trying to play a microphone analog input into my I2S earphones using my ESP32, but something is incorrect in my code.

I'm a bit confused about the input/output conversion. I hear very faint, high-pitched sounds once in a while in the earphones, but that's it. I can confirm the pin configuration is correct.

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "driver/i2s_std.h"
#include "esp_adc/adc_continuous.h"
#include "esp_err.h"
#include "esp_log.h"
#include "esp_system.h"
#include "freertos/FreeRTOS.h"
#include "freertos/ringbuf.h"

adc_continuous_handle_t handle = NULL;
i2s_chan_handle_t tx_chan = NULL;

#define BUFFER_SIZE 512        // Microphone buffer size
#define SAMPLE_RATE 20000      // Speaker sample rate
#define ADC_SAMPLE_RATE 20000  // Microphone sample rate

static const char *TAG = "AudioLoop";

void init_microphone() {
  adc_continuous_handle_cfg_t adc_config = {
      .max_store_buf_size = 4 * BUFFER_SIZE,
      .conv_frame_size = BUFFER_SIZE,
  };

  ESP_ERROR_CHECK(adc_continuous_new_handle(&adc_config, &handle));

  adc_continuous_config_t dig_cfg = {
      .sample_freq_hz = ADC_SAMPLE_RATE,
      .conv_mode = ADC_CONV_SINGLE_UNIT_1,
      .format = ADC_DIGI_OUTPUT_FORMAT_TYPE1,
      .pattern_num = 1,
  };

  adc_digi_pattern_config_t adc_pattern = {
      .atten = ADC_ATTEN_DB_0,
      .channel = ADC_CHANNEL_6,  // GPIO 34
      .unit = ADC_UNIT_1,
      .bit_width = ADC_BITWIDTH_12,
  };

  dig_cfg.adc_pattern = &adc_pattern;

  ESP_ERROR_CHECK(adc_continuous_config(handle, &dig_cfg));
  ESP_ERROR_CHECK(adc_continuous_start(handle));
}

void init_speaker() {
  i2s_chan_config_t chan_cfg = I2S_CHANNEL_DEFAULT_CONFIG(I2S_NUM_1, I2S_ROLE_MASTER);
  chan_cfg.auto_clear = true;
  i2s_std_config_t std_cfg = {
      .clk_cfg = I2S_STD_CLK_DEFAULT_CONFIG(SAMPLE_RATE),
      .slot_cfg = I2S_STD_MSB_SLOT_DEFAULT_CONFIG(I2S_DATA_BIT_WIDTH_16BIT, I2S_SLOT_MODE_MONO),
      .gpio_cfg = {
          .mclk = I2S_GPIO_UNUSED,
          .bclk = 26,
          .ws = 25,
          .dout = 22,
          .din = I2S_GPIO_UNUSED,
          .invert_flags = {
              .mclk_inv = false,
              .bclk_inv = false,
              .ws_inv = false,
          },
      },
  };

  ESP_ERROR_CHECK(i2s_new_channel(&chan_cfg, &tx_chan, NULL));
  ESP_ERROR_CHECK(i2s_channel_init_std_mode(tx_chan, &std_cfg));
  ESP_ERROR_CHECK(i2s_channel_enable(tx_chan));
}

void app_main() {
  ESP_LOGI(TAG, "Initializing audio loop");

  // Initialize microphone and speaker
  init_microphone();
  init_speaker();

  uint8_t mic_in[BUFFER_SIZE];
  uint32_t mic_bytes_read;
  int16_t speaker_out[BUFFER_SIZE];

  while (1) {
    ESP_ERROR_CHECK(adc_continuous_read(handle, mic_in, BUFFER_SIZE, &mic_bytes_read, portMAX_DELAY));

    for (size_t i = 0; i < mic_bytes_read; i++) {
      adc_digi_output_data_t *p = (adc_digi_output_data_t *)&mic_in[i * sizeof(adc_digi_output_data_t)];
      speaker_out[i] = p->type1.data;
      ESP_LOGI(TAG, "data = %" PRId16 "", p->type1.data);
    }

    size_t bytes_written;
    ESP_ERROR_CHECK(i2s_channel_write(tx_chan, speaker_out, BUFFER_SIZE, &bytes_written, portMAX_DELAY));
  }
}

Thanks in advance if you can help out !

Does it compile?
Confirm this is NOT ChatGPT code.

You probably have an ESP32 between the two. Show the wiring diagram (as always).

Hello, it compiles and flashes just fine, and no, it was not generated by ChatGPT. I heavily simplified the ADC continuous-read example from ESP-IDF: esp-idf/examples/peripherals/adc/continuous_read/main/continuous_read_main.c

The cast is a bit strange IMO. The ringbuffer receive method expects a uint8 type but it looks like we just know it actually points to something else.
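On the cast: the driver fills the buffer with fixed-size conversion entries, and the byte pointer is only how it hands them over. Below is a minimal plain-C sketch of decoding them without the cast. It assumes the ESP32 type-1 layout (2 bytes per entry, little-endian, low 12 bits are the reading, high 4 bits are the channel), and `decode_type1` is a hypothetical helper, not an ESP-IDF function. Note that the loop runs over entries (bytes / 2), whereas the loop in the first listing runs `i` up to `mic_bytes_read` while also multiplying `i` by the entry size, so it reads past the end of `mic_in`.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed ESP32 type-1 entry: 2 bytes, little-endian,
 * bits 0..11 = conversion result, bits 12..15 = channel. */
#define ENTRY_BYTES 2u

/* Hypothetical helper: decode `len` bytes of raw ADC output into
 * 12-bit readings. Returns the number of readings written to `out`. */
static size_t decode_type1(const uint8_t *raw, size_t len,
                           uint16_t *out, size_t out_cap)
{
    size_t n = len / ENTRY_BYTES;   /* number of entries, not bytes */
    if (n > out_cap) {
        n = out_cap;                /* never write past the output buffer */
    }
    for (size_t i = 0; i < n; i++) {
        /* Rebuild the 16-bit entry from its two bytes, low byte first. */
        uint16_t entry = (uint16_t)(raw[2 * i] | ((uint16_t)raw[2 * i + 1] << 8));
        out[i] = entry & 0x0FFFu;   /* keep the reading, drop the channel bits */
    }
    return n;
}
```

With a 512-byte read this yields at most 256 readings, so the output array only needs 256 elements.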

I can hear some sound in my earphones plugged into my I2S codec, but it's definitely not the right sound.

Maybe it's an issue with converting from a 16-bit array to an 8-bit array. I enabled the option forcing the ESP to do the ADC conversion from 12 to 16 bits.

As for the wiring, for the I2S it works perfectly. I have an A2DP app playing music just fine on my earphones. The configuration in the example is the same as my A2DP one.
PCM5102 I2S codec
GND - GND
3V3 - 3V3
LRCK pin 25 (WS)
DATA pin 22 (dout)
SCK pin 26 (bclk)

and the microphone is a KY-038

  • GND - GND
  • 3V3
  • A0 - GPIO 34 (ADC 1 CH 6)

That is a sound sensor. Not a microphone.

You're right. I've just changed it to a MAX9814. Wired Vdd to 3.3V, GND to GND, and Dout to GPIO 34 (ADC 1 channel 6).

This code does not echo anything:

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "driver/i2s_std.h"
#include "esp_adc/adc_continuous.h"
#include "esp_err.h"
#include "esp_log.h"
#include "esp_system.h"
#include "freertos/FreeRTOS.h"
#include "freertos/ringbuf.h"

adc_continuous_handle_t handle = NULL;
i2s_chan_handle_t tx_chan = NULL;

#define BUFFER_SIZE 320        // Microphone buffer size (32kHz / 100ms)
#define SAMPLE_RATE 16000      // Speaker sample rate
#define ADC_SAMPLE_RATE 32000  // Microphone sample rate

static const char *TAG = "AudioLoop";

void init_microphone() {
  adc_continuous_handle_cfg_t adc_config = {
      .max_store_buf_size = 4 * BUFFER_SIZE,
      .conv_frame_size = BUFFER_SIZE,
  };

  ESP_ERROR_CHECK(adc_continuous_new_handle(&adc_config, &handle));

  adc_continuous_config_t dig_cfg = {
      .sample_freq_hz = ADC_SAMPLE_RATE,
      .conv_mode = ADC_CONV_SINGLE_UNIT_1,
      .format = ADC_DIGI_OUTPUT_FORMAT_TYPE1,
      .pattern_num = 1,
  };

  adc_digi_pattern_config_t adc_pattern = {
      .atten = ADC_ATTEN_DB_0,
      .channel = ADC_CHANNEL_6,  // GPIO 34
      .unit = ADC_UNIT_1,
      .bit_width = ADC_BITWIDTH_12,
  };

  dig_cfg.adc_pattern = &adc_pattern;

  ESP_ERROR_CHECK(adc_continuous_config(handle, &dig_cfg));
  ESP_ERROR_CHECK(adc_continuous_start(handle));
}

void init_speaker() {
  i2s_chan_config_t chan_cfg = I2S_CHANNEL_DEFAULT_CONFIG(I2S_NUM_1, I2S_ROLE_MASTER);
  chan_cfg.auto_clear = true;
  i2s_std_config_t std_cfg = {
      .clk_cfg = I2S_STD_CLK_DEFAULT_CONFIG(SAMPLE_RATE),
      .slot_cfg = I2S_STD_MSB_SLOT_DEFAULT_CONFIG(I2S_DATA_BIT_WIDTH_16BIT, I2S_SLOT_MODE_MONO),
      .gpio_cfg = {
          .mclk = I2S_GPIO_UNUSED,
          .bclk = 26,
          .ws = 25,
          .dout = 22,
          .din = I2S_GPIO_UNUSED,
          .invert_flags = {
              .mclk_inv = false,
              .bclk_inv = false,
              .ws_inv = false,
          },
      },
  };

  ESP_ERROR_CHECK(i2s_new_channel(&chan_cfg, &tx_chan, NULL));
  ESP_ERROR_CHECK(i2s_channel_init_std_mode(tx_chan, &std_cfg));
  ESP_ERROR_CHECK(i2s_channel_enable(tx_chan));
}

void app_main() {
  init_microphone();
  init_speaker();

  uint8_t mic_in[BUFFER_SIZE];           // Raw microphone buffer
  int16_t downsampled[BUFFER_SIZE / 4];  // Downsampled and scaled buffer
  uint32_t mic_length_read;

  while (1) {
    // Read microphone data
    ESP_ERROR_CHECK(adc_continuous_read(handle, mic_in, BUFFER_SIZE, &mic_length_read, portMAX_DELAY));

    // Downsample and convert to 16-bit signed format
    size_t downsampled_index = 0;
    for (size_t i = 0; i < mic_length_read / sizeof(adc_digi_output_data_t); i += 2) {
      adc_digi_output_data_t *p = (adc_digi_output_data_t *)&mic_in[i * sizeof(adc_digi_output_data_t)];
      uint16_t raw_data = p->type1.data;

      // Convert 12-bit ADC data to signed 16-bit format
      downsampled[downsampled_index++] = (int16_t)((raw_data - 2048) << 4);  // Center at 0 and scale to 16-bit
    }

    // Write downsampled data to the speaker
    size_t bytes_written;
    ESP_ERROR_CHECK(i2s_channel_write(tx_chan, downsampled, downsampled_index * sizeof(int16_t), &bytes_written, portMAX_DELAY));
  }
}

Is that good or bad?

That's bad, when I speak, it should echo my voice in the earphones plugged into the PCM5102.

I've made progress. I can hear sound when I speak or make any noise, but the sound is heavily distorted and doesn't resemble anything. I must have something wrong with the timing, the frame size, or something similar.

#include <stdio.h>
#include <string.h>

#include "driver/i2s_std.h"
#include "esp_adc/adc_continuous.h"
#include "esp_err.h"
#include "esp_log.h"
#include "freertos/FreeRTOS.h"
#include "freertos/semphr.h"
#include "freertos/task.h"
#include "sdkconfig.h"

#define EXAMPLE_READ_LEN 256

i2s_chan_handle_t tx_chan = NULL;
static adc_channel_t channel[1] = {ADC_CHANNEL_7};

static TaskHandle_t s_task_handle;
static const char *TAG = "EXAMPLE";

static void i2s_init() {
  i2s_chan_config_t chan_cfg = I2S_CHANNEL_DEFAULT_CONFIG(I2S_NUM_1, I2S_ROLE_MASTER);
  chan_cfg.auto_clear = true;
  i2s_std_config_t std_cfg = {
      .clk_cfg = I2S_STD_CLK_DEFAULT_CONFIG(20000),
      .slot_cfg = I2S_STD_MSB_SLOT_DEFAULT_CONFIG(I2S_DATA_BIT_WIDTH_16BIT, I2S_SLOT_MODE_MONO),
      .gpio_cfg = {
          .mclk = I2S_GPIO_UNUSED,
          .bclk = 26,
          .ws = 25,
          .dout = 32,
          .din = I2S_GPIO_UNUSED,
          .invert_flags = {
              .mclk_inv = false,
              .bclk_inv = false,
              .ws_inv = false,
          },
      },
  };

  ESP_ERROR_CHECK(i2s_new_channel(&chan_cfg, &tx_chan, NULL));
  ESP_ERROR_CHECK(i2s_channel_init_std_mode(tx_chan, &std_cfg));
  ESP_ERROR_CHECK(i2s_channel_enable(tx_chan));
}

static bool IRAM_ATTR s_conv_done_cb(adc_continuous_handle_t handle, const adc_continuous_evt_data_t *edata, void *user_data) {
  BaseType_t mustYield = pdFALSE;
  // Notify that ADC continuous driver has done enough number of conversions
  vTaskNotifyGiveFromISR(s_task_handle, &mustYield);

  return (mustYield == pdTRUE);
}

static void continuous_adc_init(adc_channel_t *channel, uint8_t channel_num, adc_continuous_handle_t *out_handle) {
  adc_continuous_handle_t handle = NULL;

  adc_continuous_handle_cfg_t adc_config = {
      .max_store_buf_size = 10 * 1024,
      .conv_frame_size = EXAMPLE_READ_LEN,
  };
  ESP_ERROR_CHECK(adc_continuous_new_handle(&adc_config, &handle));

  adc_continuous_config_t dig_cfg = {
      .sample_freq_hz = 20000,
      .conv_mode = ADC_CONV_SINGLE_UNIT_1,
      .format = ADC_DIGI_OUTPUT_FORMAT_TYPE1,
  };

  adc_digi_pattern_config_t adc_pattern[SOC_ADC_PATT_LEN_MAX] = {0};
  dig_cfg.pattern_num = channel_num;
  for (int i = 0; i < channel_num; i++) {
    adc_pattern[i].atten = ADC_ATTEN_DB_0;
    adc_pattern[i].channel = channel[i] & 0x7;
    adc_pattern[i].unit = ADC_UNIT_1;
    adc_pattern[i].bit_width = SOC_ADC_DIGI_MAX_BITWIDTH;

    ESP_LOGI(TAG, "adc_pattern[%d].atten is :%" PRIx8, i, adc_pattern[i].atten);
    ESP_LOGI(TAG, "adc_pattern[%d].channel is :%" PRIx8, i, adc_pattern[i].channel);
    ESP_LOGI(TAG, "adc_pattern[%d].unit is :%" PRIx8, i, adc_pattern[i].unit);
  }
  dig_cfg.adc_pattern = adc_pattern;
  ESP_ERROR_CHECK(adc_continuous_config(handle, &dig_cfg));

  *out_handle = handle;
}

void app_main(void) {
  i2s_init();

  esp_err_t ret;
  uint32_t ret_num = 0;
  size_t speaker_bytes_written;
  uint8_t result[EXAMPLE_READ_LEN] __attribute__((aligned(4))) = {0};
  uint8_t data[EXAMPLE_READ_LEN / SOC_ADC_DIGI_RESULT_BYTES * 2] = {0};  // Buffer for I2S data
  memset(result, 0xcc, EXAMPLE_READ_LEN);

  s_task_handle = xTaskGetCurrentTaskHandle();

  adc_continuous_handle_t handle = NULL;
  continuous_adc_init(channel, sizeof(channel) / sizeof(adc_channel_t), &handle);

  adc_continuous_evt_cbs_t cbs = {
      .on_conv_done = s_conv_done_cb,
  };
  ESP_ERROR_CHECK(adc_continuous_register_event_callbacks(handle, &cbs, NULL));
  ESP_ERROR_CHECK(adc_continuous_start(handle));

  while (1) {
    ulTaskNotifyTake(pdTRUE, portMAX_DELAY);

    ret = adc_continuous_read(handle, result, EXAMPLE_READ_LEN, &ret_num, 0);
    if (ret != ESP_OK) {
      continue;
    }

    // Convert ADC samples to byte array for I2S
    for (int i = 0; i < ret_num; i += SOC_ADC_DIGI_RESULT_BYTES) {
      adc_digi_output_data_t *p = (adc_digi_output_data_t *)&result[i];
      uint16_t adc = p->type1.data;
      int16_t sample = ((int16_t)adc - 2048) << 4;  // Convert to signed 16-bit PCM

      // Split sample into two bytes and store in the data array
      data[i] = sample & 0xFF;             // Lower byte
      data[i + 1] = (sample >> 8) & 0xFF;  // Upper byte
    }

    // Write the byte array to the I2S speaker
    i2s_channel_write(tx_chan, data, ret_num * 2 / SOC_ADC_DIGI_RESULT_BYTES, &speaker_bytes_written, portMAX_DELAY);

    vTaskDelay(1);
  }

  ESP_ERROR_CHECK(adc_continuous_stop(handle));
  ESP_ERROR_CHECK(adc_continuous_deinit(handle));
}

I can't be much help but..

Your read (ADC) and write (I2S) sample rates have to match.

The bit depth also has to match. The ADC is 12-bit and you are writing 16-bit samples. 12-bit audio placed unshifted in 16-bit samples will be about 24 dB down (4 unused bits at roughly 6 dB per bit), so quite a bit quieter, assuming all of the bits are organized/aligned correctly and that's the only problem.
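To put a number on that level difference: each bit is worth about 6.02 dB, so every unused top bit costs that much. A small plain-C sketch follows; the helper names are made up for illustration, and the input to the scaling function is assumed to be already centered around zero.

```c
#include <stdint.h>

/* Approximate level, in dB relative to full scale, of a signal that
 * leaves `bits_unused` top bits of the sample word unused.
 * 20*log10(2) is about 6.0206 dB per bit. */
static double level_loss_db(int bits_unused)
{
    return -6.0206 * (double)bits_unused;
}

/* Scale a centered 12-bit value (-2048..2047) to the 16-bit range.
 * Multiplying by 16 has the same effect as a 4-bit left shift but is
 * well defined for negative values. */
static int16_t scale12_to_16(int16_t centered)
{
    return (int16_t)(centered * 16);
}
```

Under these assumptions, 4 unused bits come out at roughly -24 dB, and the scaled range runs from -32768 to 32752.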

Normal audio is AC. It goes positive and negative, and "normal" 16-bit audio uses signed integers. You'll have to subtract out the bias (from the sound sensor), which should be around 2048 (mid-scale) with the 12-bit ADC. That also means "near silence" should give you 12-bit readings around 2048. (The numbers will jump around. You'll never get pure silence with analog.)
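Since the real bias depends on the sensor and on the ADC attenuation, it may be safer to measure it than to hard-code it. Here is a rough plain-C sketch that estimates the bias as the mean of a block of readings, then removes it and scales to signed 16-bit with clamping. Both function names are illustrative, and falling back to 2048 for an empty block is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Estimate the DC bias of a block as the integer mean of its readings. */
static int32_t estimate_bias(const uint16_t *readings, size_t n)
{
    if (n == 0) {
        return 2048;                 /* assumed mid-scale fallback */
    }
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += readings[i];          /* 12-bit values: no overflow for blocks this small */
    }
    return (int32_t)(sum / n);
}

/* Subtract the bias, scale 12-bit to 16-bit, and clamp to the int16_t range. */
static void to_pcm16(const uint16_t *readings, size_t n, int32_t bias, int16_t *pcm)
{
    for (size_t i = 0; i < n; i++) {
        int32_t s = ((int32_t)readings[i] - bias) * 16;
        if (s > INT16_MAX) s = INT16_MAX;
        if (s < INT16_MIN) s = INT16_MIN;
        pcm[i] = (int16_t)s;
    }
}
```

If the measured bias sits far from mid-scale, or the readings pile up at 0 or 4095, the signal is probably clipping at the ADC input and no amount of arithmetic afterwards will restore it.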

This is very interesting, thanks so much for the input! As you can see in my code, I center the 12-bit data around 0 by subtracting 2048 and then shift it 4 bits to the left to get 16-bit data.
When I pass the array to the I2S driver, I split each 16-bit sample into two bytes because, from my understanding of the Bluetooth HFP examples from Espressif, the I2S channel write function expects an array of bytes and not integers.
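On bytes versus integers: `i2s_channel_write` takes a `const void *` source and a size in bytes, so an `int16_t` array can be passed directly with `count * sizeof(int16_t)` as the size. On a little-endian CPU such as the ESP32, splitting each sample into a low and a high byte by hand produces exactly the same memory. Here is a small plain-C sketch of that equivalence; the helper names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack samples low byte first, as the third listing does by hand. */
static void pack_le(const int16_t *samples, size_t n, uint8_t *bytes)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t u = (uint16_t)samples[i];
        bytes[2 * i] = (uint8_t)(u & 0xFF);   /* lower byte */
        bytes[2 * i + 1] = (uint8_t)(u >> 8); /* upper byte */
    }
}

/* Returns 1 if the hand-packed bytes equal the int16_t array's own
 * memory, which is the case on little-endian targets. */
static int packed_matches_native(const int16_t *samples, size_t n, uint8_t *scratch)
{
    pack_le(samples, n, scratch);
    return memcmp(scratch, samples, n * sizeof(int16_t)) == 0;
}
```

So the manual split is not wrong, just unnecessary; what matters is that the size argument counts bytes, not samples.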

As for the sample rates, I've chosen 20kHz for both in the code above.
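Even with matching rates, it may be worth comparing how much audio each loop iteration carries with how long the iteration can take. Below is a rough plain-C sketch of that arithmetic for the last listing, assuming 2 bytes per ADC result and a 100 Hz FreeRTOS tick (the ESP-IDF default as far as I know; check `CONFIG_FREERTOS_HZ` in your sdkconfig). The function names are made up.

```c
#include <stdint.h>

/* Duration, in microseconds, of the audio held in one frame of ADC results. */
static uint32_t frame_duration_us(uint32_t frame_bytes,
                                  uint32_t bytes_per_result,
                                  uint32_t sample_rate_hz)
{
    uint32_t samples = frame_bytes / bytes_per_result;
    return (uint32_t)(((uint64_t)samples * 1000000u) / sample_rate_hz);
}

/* Upper bound, in microseconds, of a delay of `ticks` ticks at `tick_hz`. */
static uint32_t ticks_to_us(uint32_t ticks, uint32_t tick_hz)
{
    return (uint32_t)(((uint64_t)ticks * 1000000u) / tick_hz);
}
```

With those numbers, a 256-byte frame holds 128 samples, i.e. 6.4 ms of audio at 20 kHz, while `vTaskDelay(1)` can take up to 10 ms. If that holds on your setup, the loop may consume frames more slowly than the ADC produces them, which could explain gaps and choppy output. Removing the delay and letting `i2s_channel_write` do the blocking would be one thing to try.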

I've heard about the Nyquist frequency, but I'm not sure it's worth trying to improve the audio quality at this point, as what I hear is just chatter.