
Topic: Bug with retrieving SMSs in SimCom SIM7000A module


So this is kind of a bummer: there seems to be an SMS retrieval bug in the hardware or firmware of the SIM7000A (module specs below).

Basically, when you retrieve an SMS, it shows lots of info including the # of chars to read, and in some cases this number is incorrect.  I just ran these commands in the Arduino IDE Serial monitor using a Serial connection and AT commands.

The last number, "36" here, is the # of chars in the SMS message.  So all alphanumerics work:
Code: [Select]
+CMGR: "REC READ","+15127500974",,"20/01/25,20:17:04+00",145,4,0,0,"+447797704131",145,36

All "common" symbols work.  "Common" means all the symbols on the "alternate" keyboard on my Motorola Z3 Play:
Code: [Select]
+CMGR: "REC READ","+15127500974",,"20/01/25,20:17:44+00",145,4,0,0,"+447797704131",145,19

Here's where it gets weird, which I suspect is part of the issue.  Here's the text message "abc<pounds sterling symbol>"
But you'll notice that it is indeed the right number of characters: 4
The characters are in hexadecimal, and I'm not sure what character set it is, but 0061 is "a" in most sets.  Since I'm using the Arduino IDE's serial monitor, God knows it's possible it's doing some translation to try to show me these characters.  I don't know, don't want to.  Let's not go down that route in this thread.
Code: [Select]
+CMGR: "REC UNREAD","+15127500974",,"20/01/25,21:01:52+00",145,4,0,8,"+447797704131",145,4

Now here are the examples of when you combine alphanums with "uncommon" characters.  The above message with "abc<pounds sterling symbol>" I can understand having issues, because <pounds sterling> isn't part of the first 128 chars in most character sets, but <close curly brace> is, and it causes this problem:

Correct, 3 characters:
Code: [Select]
+CMGR: "REC UNREAD","+15127500974",,"20/01/25,21:11:26+00",145,4,0,0,"+447797704131",145,3

Incorrect, also 3 characters but 4 reported:
Code: [Select]
+CMGR: "REC UNREAD","+15127500974",,"20/01/25,21:11:28+00",145,4,0,0,"+447797704131",145,4

Tilde symbol also exhibits the behavior.  This example shows that there is not just 1 extra character counted, but 1 extra character counted PER "uncommon" character:
Code: [Select]
+CMGR: "REC UNREAD","+15127500974",,"20/01/25,21:19:26+00",145,0,0,0,"+447797704131",145,6

So what happens is you try to read N characters from the next line and you get extra characters, specifically '\n' and '\r' over and over.
Here are the characters and their hexadecimal representations: 61 = 'a', 7D = <close curly brace>, D = '\r', and A = '\n'.
Code: [Select]
+CMGR: "REC READ","+15127500974",,"20/01/25,21:43:14+00",145,4,0,0,"+447797704131",145,4
61 61 61 61

+CMGR: "REC READ","+15127500974",,"20/01/25,21:43:26+00",145,4,0,0,"+447797704131",145,5
61 61 7D 61 D

+CMGR: "REC READ","+15127500974",,"20/01/25,21:43:38+00",145,4,0,0,"+447797704131",145,6
61 7D 7D 61 D A

+CMGR: "REC UNREAD","+15127500974",,"20/01/25,21:43:46+00",145,4,0,0,"+447797704131",145,6
61 7D 61 7D D A

+CMGR: "REC UNREAD","+15127500974",,"20/01/25,21:46:00+00",145,4,0,0,"+447797704131",145,8
7D 7D 7D 7D D A D A

So now I'll wait for someone to tell me this is working as designed.  The Adafruit FONA library and the Botletics library (which builds on FONA) are both victims of this behavior.  It may be something that is inherent to the chip design and therefore not a true bug, but it's a bug as far as I'm concerned.  I'm a software guy and this EE stuff is for the birds.  Sorry for the rant but it took me about 8 hours to find the root cause.


Module specs:
ATI: SIM7000A R1351
AT+GMR: Revision:1351B03SIM7000A   // firmware version B03.  I think B04 is available

AT command settings:
AT+CMGF: 1     // Text mode (not PDU mode)
AT+CSDH: 1     // Show all fields from CMGR command
AT+CPMS: ("SM"),("SM"),("SM")  // "SM" = SIM message storage.  It looks like the SIM7000A only supports SIM storage, whereas the SIM808 included "ME" (Phone storage) and others


Jan 26, 2020, 07:39 am Last Edit: Jan 26, 2020, 07:39 am by cattledog
Quote: "Basically, when you retrieve an SMS, it shows lots of info including the # of chars to read, and in some cases this number is incorrect."
I think you want to look at the character set used. AT+CSCS=?

I think there are two character sets available, GSM (GSM7) and UCS2. Do some research on these two sets.

I think that if you are using GSM there is a mix of one- and two-byte characters. The } bracket, while a standard ASCII character, is actually two bytes in GSM7.

With UCS2, all characters should be two bytes.

I would set your character set to UCS2 and see if the reported number of characters make more sense.
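
A rough way to check this theory against the lengths reported earlier in the thread: the sketch below assumes the +CMGR length field counts GSM 7-bit septets, where characters from the extension table cost two each. The names isGsmExtended and gsmSeptetCount are made up for this example and do not come from any library, and every other character is assumed to be in the default GSM alphabet.

```cpp
#include <cstddef>
#include <string>

// ASCII characters that live in the GSM 03.38 extension table.
// Each is encoded as ESC (0x1B) plus a second code, so it costs two septets.
// (The euro sign is in that table too, but it is not ASCII and is left out.)
bool isGsmExtended(char c) {
  return std::string("^{}\\[~]|").find(c) != std::string::npos;
}

// Septets needed for a plain-ASCII message in GSM 7-bit encoding.
// Assumption: all non-extended characters count as exactly one septet.
std::size_t gsmSeptetCount(const std::string& text) {
  std::size_t count = 0;
  for (char c : text) {
    count += isGsmExtended(c) ? 2 : 1;
  }
  return count;
}
```

If the assumption holds, gsmSeptetCount gives 4 for "aaaa", 5 for "aa}a", 6 for "a}}a" and 8 for "}}}}", which lines up with the lengths in the hex dumps in the first post.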


Jan 27, 2020, 07:24 am Last Edit: Jan 27, 2020, 07:25 am by jwallis
Ha, when I said I didn't want to go down the character set route, I didn't think about the device supporting multiple character sets!  You are almost definitely right about this being related.  Tilde and curly braces are all "extended" 2-byte characters.
Research: https://www.clockworksms.com/blog/the-gsm-character-set/

I'm not sure that's the whole issue.  I sent a couple SMSs and am still seeing the "incorrect" character count, except now I'm putting "incorrect" in quotes : )  The research link above says that the extended 2-byte characters should be an ESC character followed by something else, but that ESC char is not being produced by the SIM7000A.  If it were, I could just filter it out and I *might* be ok.

Here's me trying with the 3 different supported char sets.  I even tried going into GSM mode prior to sending an SMS, but the result is the same.  Someone on another forum said he talked with SimCom, so I'm going to try to bring this to their attention and see what they say.  Thanks again for the reply!

Changing char sets, same result:
Code: [Select]

// list supported character sets

+CSCS: ("IRA","GSM","UCS2")

// try IRA - this is the default


+CMGR: "REC UNREAD","+15127500974",,"20/01/27,05:54:46+00",145,4,0,0,"+447797704131",145,4

+CMGR: "REC READ","+15127500974",,"20/01/27,05:54:04+00",145,4,0,0,"+447797704131",145,5

// try GSM


+CMGR: "REC READ","+15127500974",,"20/01/27,05:54:46+00",145,4,0,0,"+447797704131",145,4

+CMGR: "REC READ","+15127500974",,"20/01/27,05:54:04+00",145,4,0,0,"+447797704131",145,5

// try UCS2


+CMGR: "REC READ","002B00310035003100320037003500300030003900370034",,"20/01/27,05:54:46+00",145,4,0,0,"002B003400340037003700390037003700300034003100330031",145,4

+CMGR: "REC READ","002B00310035003100320037003500300030003900370034",,"20/01/27,05:54:04+00",145,4,0,0,"002B003400340037003700390037003700300034003100330031",145,5
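
For anyone puzzled by the long hex strings in the UCS2 output: in UCS2 mode the string fields come back as 4 hex digits per character, so "002B0031..." is just the phone number again. Here is a minimal sketch of decoding such a field, assuming it only contains characters in the ASCII range (anything else becomes '?'). decodeUcs2Hex is a hypothetical helper name, not something from the FONA or Botletics libraries.

```cpp
#include <cstddef>
#include <string>

// Decode a UCS2 hex string (4 hex digits per character) into ASCII.
// Code points above 0x7F are replaced with '?' to keep the sketch short.
// Note: std::stoul throws if a group is not valid hexadecimal.
std::string decodeUcs2Hex(const std::string& hex) {
  std::string out;
  for (std::size_t i = 0; i + 4 <= hex.size(); i += 4) {
    unsigned long code = std::stoul(hex.substr(i, 4), nullptr, 16);
    out.push_back(code < 0x80 ? static_cast<char>(code) : '?');
  }
  return out;
}
```

Feeding it the first quoted field from the UCS2 output above gives back "+15127500974", and "0061" decodes to "a".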


I understand that you want the reported length parameter to be 4 for either "aaaa" or "aa}a".

I'm not clear why you need this length parameter or how you want to use it, but you should be able to parse the +CMGR message and get the last piece of data into a separate character array (c-string) and test for its length. It should be 4 in both cases.
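
Something along these lines is what I mean. It is only a sketch and assumes the whole reply has already been buffered and has the usual text-mode framing: the +CMGR header line, then the message, then "\r\n\r\nOK\r\n" at the very end. extractSmsBody is a made-up name for illustration.

```cpp
#include <cstddef>
#include <string>

// Pull the message text out of a fully buffered +CMGR reply.
// Assumed framing: "+CMGR: <fields>\r\n<body>\r\n\r\nOK\r\n".
// Returns an empty string if the reply does not match that shape.
std::string extractSmsBody(const std::string& reply) {
  const std::string tail = "\r\n\r\nOK\r\n";
  std::size_t header = reply.find("+CMGR:");
  if (header == std::string::npos) return "";
  std::size_t bodyStart = reply.find("\r\n", header);
  if (bodyStart == std::string::npos) return "";
  bodyStart += 2;  // skip the line ending after the header
  // Search from the end so a body containing newlines is kept whole.
  std::size_t bodyEnd = reply.rfind(tail);
  if (bodyEnd == std::string::npos || bodyEnd < bodyStart) return "";
  return reply.substr(bodyStart, bodyEnd - bodyStart);
}
```

Calling size() on the result (or strlen on a C string copy) then gives 4 for both "aaaa" and "aa}a", whatever the header says.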



I discussed above that the Adafruit library (adopted and extended by the Botletics library) uses the reported length parameter to determine how many characters are in the SMS, and therefore how many characters it should read from the serial connection to the SIM7000A.

For my project, no incoming messages should have a newline, so I could simply stop reading when I get a '\r' or a '\n' or when serial.available() is 0, but this will not work in all cases for all users of these libraries.

As shown above, if the SMS is "a{aa", the reported length param will be 5, but the data you read from the serial connection will be "a{aa\r" rather than "a{aa", so the last piece of data will have that extra '\r' in it.  That is if I'm understanding you correctly.  These libraries are just doing
Code: [Select]
  while (numChars > 0) {             // reported length param
    if (mySerial->available()) {
      replybuffer[idx] = mySerial->read();
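
One possible workaround for a loop like that, sketched in plain C++ so it can be tried off the board: treat the reported length as a budget of GSM septets and charge 2 for each extended character read. This rests on the assumption that the module counts extended characters twice, which matches the dumps above but is not confirmed by SimCom. The names costsTwoSeptets, readSmsBody and readSmsBodyFromCapture are hypothetical, and a std::istream stands in for the serial port.

```cpp
#include <istream>
#include <sstream>
#include <string>

// ASCII characters from the GSM 03.38 extension table (two septets each).
bool costsTwoSeptets(char c) {
  return std::string("^{}\\[~]|").find(c) != std::string::npos;
}

// Read an SMS body from `in`, treating reportedLen as a septet budget
// rather than a character count (assumption described above).
std::string readSmsBody(std::istream& in, int reportedLen) {
  std::string body;
  int remaining = reportedLen;
  char c;
  while (remaining > 0 && in.get(c)) {
    body.push_back(c);
    remaining -= costsTwoSeptets(c) ? 2 : 1;
  }
  return body;
}

// Helper for trying the function on a captured reply instead of a live port.
std::string readSmsBodyFromCapture(const std::string& captured, int reportedLen) {
  std::istringstream in(captured);
  return readSmsBody(in, reportedLen);
}
```

With the captures from earlier in the thread, a reported length of 5 followed by "aa}a\r\n..." yields "aa}a" with no trailing '\r'. On the Arduino side the same accounting could presumably be done in the existing loop by decrementing numChars by 2 whenever the byte just read is one of those characters.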

For my project, the solution is clear because the user will never send special characters over SMS unless they are sending the Hologram Device Key, which is exactly 8 chars, so I just have a line: devKey[8] = '\0';

I really appreciate your interest in this issue.  Likewise I am simply trying to help anyone who uses the Adafruit or Botletics library, because there are a lot of them.  Please let me know if I misunderstood your post.


Quote: "I really appreciate your interest in this issue.  Likewise I am simply trying to help anyone who uses the Adafruit or Botletics library, because there are a lot of them."
Yes, if those libraries are using the reported length parameter to read replies from a buffer, there will be an issue.

I would report an issue on the Adafruit GitHub site for the library.


Feb 04, 2020, 02:05 am Last Edit: Feb 04, 2020, 02:23 am by jwallis
That's a good call, I actually hadn't thought of doing that.  Although I think there is no good solution other than to make their users aware.

Hm, that's actually a little awkward since Adafruit doesn't sell any boards with a SIM7000A on it...

That said, this probably happens with the SIM808 as well.  I'll log it.
