Dealing with UCS2-encoded SMS deliveries

Hello,
I am doing some tests on the SIM800L GSM modem and I would like to share with you a little difficulty that I encountered.
I work in text mode (AT+CMGF=1).
• If I send a message with ‘normal’ characters (ASCII 0 to 127), everything is fine. For example, if I send the message LED_ON, I receive:
+CMT: "+212--------","","21/01/01,12:45:18+04"
LED_ON

• On the other hand, if I send a message containing special characters such as ç, the delivered SMS uses a different encoding. For example, if I send the message "Comment ça va", I receive:

+CMT: "+212--------","","21/01/01,12:51:21+04"
0043006F006D006D0065006E0074002000E70061002000760061

• The body of the message is now a string of hexadecimal digits: each group of four digits is the UCS2 code of one character. 0043 = ‘C’, 006F = ‘o’, …
• Printing Unicode characters on the serial monitor is tricky. After some tests, I realized that the Arduino IDE serial monitor expects UTF-8. This means that to display the character β (Unicode U+03B2, UTF-8 0xCE 0xB2) you must send the byte 0xCE followed by the byte 0xB2 to the serial monitor, for example:

  uint16_t N = 0xCEB2;
  Serial.write(highByte(N));
  Serial.write(lowByte(N));

• Note: Serial.write((byte *)&N, 2) also works, but it transmits the LSB first (little-endian), so you must swap the two bytes of the UTF-8 code beforehand.
• Here is a short sketch that displays "arduino" in Arabic:

void setup() {
    Serial.begin(9600);
        
    uint16_t U;    
    char SS[] = "06720631062F0648064A06460648";
    int n = strlen(SS);
    char S[5] = {0};  // 4 hex digits + \0; zero-initialized because strncpy() copies exactly 4 chars and adds no terminator
    
    for(int i = 0; i < n ; i+=4){
        strncpy(S, &SS[i], 4);
        U = strtoul(S,NULL,16);
        unicode2utf8(U,1);  // swapped byte order [L H]
        Serial.write((byte*)&U,2); // little-endian: the low byte is sent first
    } 
}

void loop() {
    // put your main code here, to run repeatedly:

}
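void unicode2utf8(uint16_t& U,int r){
    // Converts U in place to its 2-byte UTF-8 form.
    // Valid for code points U+0080 to U+07FF only; values below U+0080 are left unchanged.
    if(U >= 0x80){
        uint8_t UL = (U & 0x3F) | 0B10000000;   // 10xxxxxx: low 6 bits
        uint8_t UH = (U >> 6) | 0B11000000;     // 110xxxxx: high 5 bits
        if(r)   U = ((uint16_t)UL<<8) | UH;  // swapped [UL UH], for a little-endian Serial.write
        else    U = ((uint16_t)UH<<8) | UL;  // natural order [UH UL]
    }
}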
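
The function above only covers code points up to U+07FF (two UTF-8 bytes). As a rough sketch of how the same idea might be extended to the whole UCS2 range (one, two, or three UTF-8 bytes), here is a plain C++ version that can be tested on a PC. The names appendUtf8 and ucs2HexToUtf8 are my own and purely illustrative. I assume the message body is a clean hex string with four digits per character, and surrogate pairs (used for emoji, for instance) are not handled.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

// Append the UTF-8 encoding of one code point (U+0000..U+FFFF) to out.
// Illustrative helper: surrogates (U+D800..U+DFFF) are not treated specially.
void appendUtf8(std::string& out, uint16_t cp) {
    if (cp < 0x80) {                 // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Decode a UCS2 hex string (4 hex digits per character) into UTF-8.
// Assumption: the input contains only hex digits; a trailing incomplete
// group of fewer than 4 digits is ignored.
std::string ucs2HexToUtf8(const std::string& hex) {
    std::string out;
    for (std::size_t i = 0; i + 4 <= hex.size(); i += 4) {
        const std::string group = hex.substr(i, 4);
        const uint16_t cp =
            static_cast<uint16_t>(std::strtoul(group.c_str(), nullptr, 16));
        appendUtf8(out, cp);
    }
    return out;
}
```

With this, ucs2HexToUtf8("0043006F006D006D0065006E0074002000E70061002000760061") should give back "Comment ça va" as UTF-8 bytes. On an Arduino board where std::string is not available, the same three branches could write each byte directly with Serial.write() instead of appending to a string.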