Algorithm for converting encoded octets into septets

Can anyone shed some light on decoding packed octets into septets? I've found various converters online, but if someone could help me out with an algorithm or post a link to one, that would be great.

Thanks in advance.

PS: In case what I said was unclear, here's a link describing what I'm trying to write code for:
http://www.smartposition.nl/resources/sms_pdu.html

You can look at the bytes as a stream of bits.

  • Write a function that extracts 7 bits at a time from it (and updates an internal pointer recording how far it has read) [char mode]
    or
  • Write a function that takes 7 bytes and extracts 8 chars from it. [block mode]
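
A minimal sketch of the char-mode idea, in case it helps. The struct and function names (septet_reader, next_septet) are made up for illustration, and it assumes the GSM-style packing from the linked page, where the first septet sits in the low bits of the first octet.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state: the packed octets plus a bit position. */
typedef struct {
    const uint8_t *data;   /* packed octets */
    size_t len;            /* number of octets */
    size_t bitpos;         /* index of the next unread bit */
} septet_reader;

/* Return the next 7-bit value, or -1 when fewer than 7 bits remain.
 * Assumption: GSM-style packing, first septet in the low bits of data[0]. */
int next_septet(septet_reader *r)
{
    if (r->bitpos + 7 > r->len * 8)
        return -1;

    size_t index = r->bitpos / 8;               /* octet holding the low bits */
    unsigned shift = (unsigned)(r->bitpos % 8); /* offset inside that octet */

    unsigned v = (unsigned)r->data[index] >> shift;
    if (shift > 1)                              /* septet spills into the next octet */
        v |= (unsigned)r->data[index + 1] << (8 - shift);

    r->bitpos += 7;                             /* advance the internal pointer */
    return (int)(v & 0x7F);
}
```

Calling next_septet() in a loop until it returns -1 yields the septets one by one. Note that trailing padding bits can produce one extra septet, so in a real PDU the length field should decide how many to keep.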

To do this you need to understand bit math: shifting bits right and left with >> and <<, and merging parts with | (OR). Search for "bit math" for an introduction.
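
To show how the shifts and the OR fit together, here is a rough block-mode sketch that turns 7 packed octets into 8 septets. The name unpack_block is made up for this example, and it again assumes GSM-style packing (first septet in the low bits of the first octet).

```c
#include <stdint.h>

/* Unpack one block: 7 packed octets in, 8 septets out.
 * Assumption: GSM-style packing, first septet in the low bits of in[0]. */
void unpack_block(const uint8_t in[7], uint8_t out[8])
{
    uint8_t carry = 0;   /* bits left over from the previous octet */

    for (int i = 0; i < 7; i++) {
        /* the low (7 - i) bits of in[i] go above the i carried bits */
        out[i] = (uint8_t)(((in[i] << i) | carry) & 0x7F);
        /* the top (i + 1) bits of in[i] are carried forward */
        carry = (uint8_t)(in[i] >> (7 - i));
    }
    out[7] = carry;      /* after 7 octets the carry is a full septet */
}
```

Each pass shifts the current octet left to make room for the carried bits, merges them with |, masks to 7 bits, and then shifts right to keep the bits that belong to the next septet.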

This code should work on an 8-bit AVR platform (little-endian), but it's just a quick hack and untested.

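int convert8to7(byte *input, byte *output, int inlength) {
  // Overlay two bytes on one 16-bit word; on a little-endian CPU b[0] is the low byte.
  union {
    byte b[2];
    uint16_t u;
  };
  int outlength = inlength * 8 / 7;   // number of whole septets in the input
  for (int i = 0; i < outlength; i++) {
    int index = i * 7;                // bit position where septet i starts
    int m = index % 8;                // bit offset inside the byte
    index /= 8;                       // byte that holds the septet's low bits
    b[0] = input[index];
    // The high bits of the septet, if any, live in the following byte.
    b[1] = index + 1 < inlength ? input[index + 1] : 0;
    output[i] = (u >> m) & 0x7F;
  }
  return outlength;
}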

Try these:

/* The translation table is defined further down. */
extern uint8_t nk_gsm_chartable[];

/* Decode the packed contents of the specified buffer of length len
 * into an ISO-8859-15 string.  The output string is null terminated.
 * Return the output string length.  The output string is up to 8/7ths
 * the length of the input, so in-place decoding is not possible.
 */
int nk_decode_characters(char *dest, uint8_t *data, int len)
{
   uint8_t c7;
   char *ptr;
   uint8_t havebits;
   unsigned bitfield, c8;

   havebits = 0;
   bitfield = 0;
   ptr = dest;

   while (len--) {
      c8 = *data++;

//debug_print("data -> 0x%02x from data", c8);

      /* glue the octet on the left side of the bitfield
       */
      c8 <<= havebits;
      bitfield |= c8;
      havebits += 8;

//debug_print(" bitfield 0x%04x %d bits\n", bitfield, havebits);

      /* and shift out the 7-bit characters
       */
      while (havebits >= 7) {
          c7 = bitfield & 0x7f;

//debug_print("   out 0x%02x from bitfield", c7);

          *ptr++ = nk_gsm_chartable[c7];
          bitfield >>= 7;
          havebits -= 7;

//debug_print(" bitfield 0x%04x %d bits\n", bitfield, havebits);

      }
   }
   *ptr = '\0';
   return(ptr - dest);
}
/* Encode the input string in ISO-8859-15 characters
 * into the octet stream dest.  Return the number of
 * octets written.  It is possible for the two
 * buffers to be the same for in-place encoding.
 */
int nk_encode_characters(uint8_t *dest, char *string)
{
   uint8_t i, c8, *ptr;
   uint8_t havebits;
   unsigned bitfield, c7;

   ptr = dest;
   havebits = 0;
   bitfield = 0;

   while (c8 = *string++) {

      /* Search the chartable for a matching character.
       * if there is no match, or the match is to
       * inverted ?, then convert to inverted ?
       */
      for (i = 0; i < 128; i++) {
          if (c8 == nk_gsm_chartable[i]) {
              c7 = i;
              break;
          }
      }
      if (i == 128 || c8 == 191)
          c7 = 96;

// debug_print("%c --> 0x%02x (%c)\n", c8, c7, c7);

      c7 <<= havebits;
      bitfield |= c7;
      havebits += 7;

      if (havebits >= 8) {
          *ptr++ = bitfield & 0xff;
          bitfield >>= 8;
          havebits -= 8;
      }
   }

   if (havebits)
      *ptr++ = bitfield & 0xff;

   return(ptr - dest);
}
/* This table translates 7-bit GSM codes to ISO-8859-15 ones.
 * There are some mismatches, and characters that don't convert
 * to ISO-8859-15 ones are mapped to 191, inverted question mark.
 * The correct GSM inverted ? (96) must be handled as a special
 * case.
 */

uint8_t nk_gsm_chartable[] = {
 /*   0 */    64, 163,  36, 165, 232, 233, 249, 236,
 /*   8 */   242, 199,  10, 216, 248,  13, 197, 229,
 /*  16 */   191,  95, 191, 191, 191, 191, 191, 191,
 /*  24 */   191, 191, 191, 191, 198, 230, 223, 201,
 /*  32 */    32,  33,  34,  35, 164,  37,  38,  39,
 /*  40 */    40,  41,  42,  43,  44,  45,  46,  47,
 /*  48 */    48,  49,  50,  51,  52,  53,  54,  55,
 /*  56 */    56,  57,  58,  59,  60,  61,  62,  63,
 /*  64 */   161,  65,  66,  67,  68,  69,  70,  71,
 /*  72 */    72,  73,  74,  75,  76,  77,  78,  79,
 /*  80 */    80,  81,  82,  83,  84,  85,  86,  87,
 /*  88 */    88,  89,  90, 196, 214, 209, 220, 167,
 /*  96 */   191,  97,  98,  99, 100, 101, 102, 103,
 /* 104 */   104, 105, 106, 107, 108, 109, 110, 111,
 /* 112 */   112, 113, 114, 115, 116, 117, 118, 119,
 /* 120 */   120, 121, 122, 228, 246, 241, 252, 224
};

(edited to remove PROGMEM from nk_gsm_chartable[] since the code I pasted does not do the pgm_read_byte())

Thanks, all of you, for the help!

I just realized that the nk_gsm_chartable[] I pasted in had a PROGMEM storage class which would cause strange behaviour on Arduino with the code I pasted. I removed this. The job of placing nk_gsm_chartable[] in PROGMEM properly is left to the reader.
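
For anyone who wants a starting point, here is a rough sketch of the PROGMEM variant. It uses PROGMEM and pgm_read_byte() from avr-libc's <avr/pgmspace.h>. The names nk_gsm_chartable_P and nk_gsm_lookup are made up for this example, the table is only an excerpt (the first 8 entries), and the #else branch is a host fallback I added so the sketch also compiles off-target.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback (assumption): lets the sketch compile on a PC. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Excerpt only: the first 8 entries of nk_gsm_chartable[].
 * A real table needs all 128 entries. */
static const uint8_t nk_gsm_chartable_P[] PROGMEM = {
    64, 163, 36, 165, 232, 233, 249, 236
};

/* Hypothetical helper: fetch one table entry from flash.
 * With this excerpt, only c7 values 0..7 are valid. */
uint8_t nk_gsm_lookup(uint8_t c7)
{
    return pgm_read_byte(&nk_gsm_chartable_P[c7]);
}
```

With the full table in flash, every direct read such as nk_gsm_chartable[c7] in the decode and encode functions would go through a helper like this instead.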