Morse Code encoder

A mixture of high school students and adults through their 70s.

Not everyone learns at the same pace.

It's just the way it is!

For the decoder, I would be inclined to convert the positions into a string and then use strcmp to compare that string to the values in letters and numbers.

That is: make allPositions an array of char and put dots and dashes into it rather than 1s and 2s. Add a final '\0' to terminate the string, then loop through your letters and numbers arrays to find a match. If it matches a letter, add the position to 'A' to get the letter. If it matches a number, add '0' to the position to get the digit.
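A minimal sketch of that lookup in plain C++, assuming the symbol has already been collected as a '\0'-terminated string of '.' and '-'. The names kLetters, kDigits and decodeSymbol are made up for illustration and are not from anyone's sketch in this thread:

```cpp
#include <cstring>

// Morse patterns indexed by letter (A..Z) and by digit (0..9).
static const char* const kLetters[26] = {
    ".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....", "..", ".---",
    "-.-", ".-..", "--", "-.", "---", ".--.", "--.-", ".-.", "...", "-",
    "..-", "...-", ".--", "-..-", "-.--", "--.."};
static const char* const kDigits[10] = {
    "-----", ".----", "..---", "...--", "....-",
    ".....", "-....", "--...", "---..", "----."};

// Returns the decoded character, or '?' if the pattern is not in either table.
char decodeSymbol(const char* symbol) {
    for (int i = 0; i < 26; ++i)
        if (std::strcmp(symbol, kLetters[i]) == 0)
            return static_cast<char>('A' + i);   // position + 'A' gives the letter
    for (int i = 0; i < 10; ++i)
        if (std::strcmp(symbol, kDigits[i]) == 0)
            return static_cast<char>('0' + i);   // position + '0' gives the digit
    return '?';
}
```

Calling decodeSymbol("-.-.") walks the letters table until it hits index 2 and returns 'C'. At worst it does 36 short string compares per character.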

PaulMurrayCbr:
For the decoder, I would be inclined to convert the positions into a string and then use strcmp to compare that string to the values in letters and numbers.

<...>

Painfully inefficient.

Ray

mrburnette:
Painfully inefficient.

Ray

I see your point. However, Morse is rarely transmitted at a rate of more than a few characters a second. What is the need for efficiency?

Just because there is no "need for efficiency" doesn't mean one shouldn't strive for the best solution possible. It is such mental stretching that helps us learn and improve. Believe it or not, the record for Morse right now is 140 words per minute. A word is defined as 5 characters, with a timing equivalent to the code for the word "PARIS". 700 characters per minute is clipping along at a pretty brisk pace. I would guess that most Morse users fall in the 15-25 wpm category, which is only about 2 characters per second.
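To put numbers on that: "PARIS" plus its trailing word gap is 50 dot-lengths long, so one dot lasts 60000 / (50 x wpm) = 1200 / wpm milliseconds. A small sketch of the arithmetic follows; the function names are made up for illustration:

```cpp
// Dot length in milliseconds for a given speed, using the 50-unit "PARIS" word.
constexpr double dotMillis(double wpm) {
    return 1200.0 / wpm;          // 60000 ms per minute / (50 units * wpm)
}

// Characters per second, using the 5-characters-per-word convention.
constexpr double charsPerSecond(double wpm) {
    return wpm * 5.0 / 60.0;
}
```

At 20 wpm a dot is 60 ms and the rate is about 1.7 characters per second. At the 140 wpm record, a dot is under 9 ms and the rate is just under 12 characters per second, which is still slow as far as a microcontroller is concerned.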

But run time efficiency isn't the only factor in what makes the best solution. I think you must agree.

aarg:
But run time efficiency isn't the only factor in what makes the best solution. I think you must agree.

Well, I do not disagree: I write differently depending on whether I intend to publish on my webpage or am writing for profit. Generally, when being paid, I write first for maintainability and the ability to produce meaningful diagnostic messages! I evaluate both against performance. For publishing, I generally attempt to code to a simpler-to-explain syntax unless I am showcasing a specific technique.

But I would rarely do a linear search if I could easily implement a binary search or an indexed search. Poorly implementing an approach is, IMO, neither productive nor instructive.
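For comparison, here is one common way to turn the lookup into an indexed search. This is a sketch only, not the table used by any decoder in this thread. The symbol is packed into a small integer: start with a sentinel 1 bit, then shift in 0 for each dot and 1 for each dash. The result indexes directly into a 64-entry table that covers letters and digits (up to 5 elements). The names kTree and decodeIndexed are made up for illustration:

```cpp
// Index = sentinel bit followed by one bit per element (dot = 0, dash = 1).
// '*' marks codes that are not assigned in this letters-and-digits table.
static const char kTree[65] =
    "**ETIANMSURWDKGO"   //  0 - 15
    "HVF*L*PJBXCYZQ**"   // 16 - 31
    "54*3***2*******1"   // 32 - 47
    "6*******7***8*90";  // 48 - 63

// Returns the decoded character, or '*' for an unknown or over-long symbol.
char decodeIndexed(const char* symbol) {
    unsigned code = 1;                               // sentinel marks the length
    for (; *symbol; ++symbol) {
        if (*symbol != '.' && *symbol != '-') return '*';
        code = (code << 1) | (*symbol == '-' ? 1u : 0u);
        if (code >= 64) return '*';                  // more than 5 elements
    }
    return kTree[code];
}
```

For "-.-." the bits are 1, then 1 0 1 0, giving binary 11010 = 26, and kTree[26] is 'C'. There are no string compares at all, just one shift per element and one array read per character.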

Ray

Actually, a solution that reflects the actual problem as faithfully as possible usually yields the best results in both code clarity and efficiency. Obviously there is some tension between the two, which the programmer can trade off according to personal taste or application needs.

I am revolted by needless inefficiencies, but if there seems to be a rationale or justification for it, I have an open mind. I have only rarely been paid to code, but my coding style doesn't change very much except between what is explicitly test code and final code.

Actually, knowing well that there are many people whose programming knowledge far outstrips mine, motivates me to run a tight ship so I don't screw myself up.

I think for Morse decoding, I would not want to conflate dot-dash resolution and character resolution. I would try to recognize a full symbol and only then translate it to a character.

I have a Morse code sender here if that is any help.

Now that we have encoding Morse pretty well nailed down, how 'bout decoding Morse code, which is a little more interesting. Some interesting ways are described at:

Ragnar Aronsen: http://raronoff.wordpress.com/2010/12/16/morse-endecoder/

Bud Churchward: http://www.pg9hf.nl/wb7fhc/The%20WB7FHC%20CW%20Decoder%20v%201.1.html

(There's another one, but I can't find it.)

The problem with decoding is that people have different "fists", or rhythms, when sending code that doesn't follow the mathematical timing that's required. The ARRL sends practice code over the air on a regular basis and it is mathematically perfect...up to 18wpm. It then starts using Farnsworth encoding, which throws the timing out the window. The best hardware/software solution we could develop using human inputs was about 30wpm, and even then it wasn't 100%, but enough to read it.
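For reference, the "mathematical" timing says the gap between elements of a character is 1 dot-length, between characters 3, and between words 7. A decoder has to pick thresholds somewhere between those values. The sketch below simply uses 2 and 5 dot-lengths; that choice and the names are my own assumptions, not taken from any of the decoders mentioned here:

```cpp
enum class Gap { Element, Character, Word };

// Classify a key-up interval, given the current dot length in milliseconds.
Gap classifyGap(unsigned long gapMs, unsigned long dotMs) {
    if (gapMs < 2 * dotMs) return Gap::Element;     // nominally 1 dot-length
    if (gapMs < 5 * dotMs) return Gap::Character;   // nominally 3 dot-lengths
    return Gap::Word;                               // nominally 7 dot-lengths
}
```

With a 60 ms dot, a sloppy 100 ms pause still counts as an element gap and 250 ms as a character gap. It also shows why stretched spacing is a problem: once the gaps between characters are lengthened well beyond 3 dot-lengths, fixed thresholds like these start reporting word gaps between every letter.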

I seem to remember writing a decoder a while back. Here it is from my sketches folder:

// Morse code decoder
// Author: Nick Gammon
// Date: 31 August 2015

const byte SIGNAL_PIN = 2;
const int SIGNAL_COUNT = 6;  // maximum number of dots/dashes in a letter

// calculated later on
unsigned long dotLength = 0;  // mS
unsigned long dashLength;  
unsigned long wordLength; 
unsigned long FUZZ_FACTOR;

volatile unsigned int widths [SIGNAL_COUNT];
volatile byte count;
volatile bool letterDone;
volatile bool haveSpace;
volatile bool adjusting = true;

volatile unsigned long lastPulse;

const char * letters [26] = {
   ".-",     // A
   "-...",   // B
   "-.-.",   // C
   "-..",    // D
   ".",      // E
   "..-.",   // F
   "--.",    // G
   "....",   // H
   "..",     // I
   ".---",   // J
   "-.-",    // K
   ".-..",   // L
   "--",     // M
   "-.",     // N
   "---",    // O
   ".--.",   // P
   "--.-",   // Q
   ".-.",    // R
   "...",    // S
   "-",      // T
   "..-",    // U
   "...-",   // V
   ".--",    // W
   "-..-",   // X
   "-.--",   // Y
   "--.."    // Z
};

const char * numbers [10] = 
  {
  "-----",  // 0
  ".----",  // 1
  "..---",  // 2
  "...--",  // 3
  "....-",  // 4
  ".....",  // 5
  "-....",  // 6
  "--...",  // 7
  "---..",  // 8
  "----.",  // 9
  };
  
// ISR
void gotPulse ()
  {
    
  unsigned long now = millis ();
  unsigned int width = now - lastPulse;
  lastPulse = now;
  
  byte pinState = digitalRead (SIGNAL_PIN);
  
  if (!adjusting)
    {
    // a long gap means we start again
    if (pinState == HIGH && width >= (dashLength - FUZZ_FACTOR))
      count = 0;
  
    // a really long gap means we have a space
    if (pinState == HIGH && width >= (wordLength - FUZZ_FACTOR))
      haveSpace = true;
    }
    
  if (count >= SIGNAL_COUNT)
    return;
    
  if (pinState == LOW && !letterDone)
    widths [count++] = width;
  }
  
void processCharacter ()
  {
  if (haveSpace)
    {
    Serial.print (" ");
    haveSpace = false;
    }
         
  char result [SIGNAL_COUNT + 1];  
  for (int i = 0; i < count; i++)
    {
    unsigned int width = widths [i];
    if (width < dotLength + FUZZ_FACTOR)
      result [i] = '.';
    else
      result [i] = '-';
    }
  result [count] = 0;  // null terminator
  
//  Serial.print (result);
//  Serial.print (" ");
  
  for (int i = 0; i < 26; i++)
    if (strcmp (result, letters [i]) == 0)
      {
      Serial.print (char (i + 'A'));
      return;
      }
      
  for (int i = 0; i < 10; i++)
    if (strcmp (result, numbers [i]) == 0)
      {
      Serial.print (char (i + '0'));
      return;
      }
  
  Serial.print ("?");   
  }  // end of processCharacter
 
void calculateWidths ()
  {
  // ignore the first one, might be spurious
  dotLength = widths [1];
  
  for (int i = 2; i < count; i++)
    {
    unsigned int width = widths [i];
    if (width < dotLength / 2)  // less than half the length?
      dotLength = width;
    }  
    
  dashLength = dotLength * 3;  
  wordLength = dashLength * 2; 
  FUZZ_FACTOR = dotLength / 10;  
  adjusting = false;
  count = 0;
  Serial.print ("Dot width = ");
  Serial.println (dotLength);
  }  // end of calculateWidths
  
void setup ()
  {
  Serial.begin (115200);
  Serial.println ();
  Serial.println ("Starting ...");
  attachInterrupt (0, gotPulse, CHANGE);
  EIFR = bit (INTF0);  // clear flag for interrupt 0
  }  // end of setup

void loop ()
  {
  if (adjusting && count >= SIGNAL_COUNT)
    calculateWidths ();

  if (adjusting)
    return;
    
  if (digitalRead (SIGNAL_PIN) == LOW && 
     (millis () - lastPulse) >= (dotLength * 2) &&
     count > 0
     )
    {
    letterDone = true;
    processCharacter ();
    count = 0;
    letterDone = false;
    }
  }  // end of loop

It auto-adjusts to the sending speed after a bit of time "learning" the length of a dot and dash.

Original thread: morse decoder problem - #9 by nickgammon - Programming Questions - Arduino Forum

Example of decoding a morse 'C' string using a lookup table.

If you're interested I can post the code that generates the 'a_decode' table based on the 'morse' table, which would allow you to add or remove the morse characters you wish to allow.

The version encoded here is a six element encoding to include some punctuation characters.

#define NUM_ENTRIES(ARRAY)      (sizeof(ARRAY) / sizeof(ARRAY[0]))

enum { oDIT = 1, oDAH = 64 };

char a_decode[] =
{
      '*', 'E', 'I', 'S', 'H', '5', '*' , '*'
    , '4', '*', '*', 'V', '*', '*', '*' , '3'
    , '*', '*', 'U', 'F', '*', '*', '*' , '*'
    , '*', '*', '*', '*', '?', '*', '2' , '*'
    , '*', 'A', 'R', 'L', '*', '*', '*' , '*'
    , '"', '*', '*', '*', '*', '.', '*' , '*'
    , '*', 'W', 'P', '*', '*', '*', '*' , '*'
    , '*', 'J', '*', '*', '*', '1', '\'', '*'
    , 'T', 'N', 'D', 'B', '6', '*', '-' , '*'
    , '*', '*', 'X', '/', '*', '*', '*' , '*'
    , '*', 'K', 'C', '*', '*', '*', '*' , '*'
    , '*', 'Y', '(', '*', ')', '*', '*' , '*'
    , 'M', 'G', 'Z', '7', '*', '*', '*' , '*'
    , ',', 'Q', '*', '*', '*', '*', '*' , '*'
    , 'O', '*', '8', ':', '*', '*', '*' , '*'
    , '*', '9', '*', '*', '0', '*', '*' , '*'
};

void loop()
{   }

void setup()
{
    Serial.begin(9600);
    
    // 'DECODE' THIS LIST OF MORSE CHARACTER STRINGS
    // THIS TABLE INCLUDES ALL CHARACTERS ENCODED IN THE 'a_decode' LOOKUP TABLE
    const char* morse[]
    {
          ".-"      // A
        , "-..."    // B
        , "-.-."    // C
        , "-.."     // D
        , "."       // E
        , "..-."    // F
        , "--."     // G
        , "...."    // H
        , ".."      // I
        , ".---"    // J
        , "-.-"     // K
        , ".-.."    // L
        , "--"      // M
        , "-."      // N
        , "---"     // O
        , ".--."    // P
        , "--.-"    // Q
        , ".-."     // R
        , "..."     // S
        , "-"       // T
        , "..-"     // U
        , "...-"    // V
        , ".--"     // W
        , "-..-"    // X
        , "-.--"    // Y
        , "--.."    // Z
        , "-----"   // 0
        , ".----"   // 1
        , "..---"   // 2
        , "...--"   // 3
        , "....-"   // 4
        , "....."   // 5
        , "-...."   // 6
        , "--..."   // 7
        , "---.."   // 8
        , "----."   // 9
        , "-....-"  // -
        , "--..--"  // ,
        , "---..."  // :
        , "..--.."  // ?
        , ".-.-.-"  // .
        , "-.--."   // (
        , "-.--.-"  // )
        , "-..-."   // /
        , ".----."  // ' (apostrophe; a trailing backslash here would swallow the next line)
        , ".-..-."  // "
    };

    for ( size_t n = 0; n < NUM_ENTRIES(morse); n++ )
    {
        const char* psz = morse[n];
        size_t      o   = 0;

        for ( size_t iDah = oDAH; *psz; psz++, iDah >>= 1 )
        {
            o += (('.' == *psz) ? oDIT : iDah);
        }

         Serial.println(a_decode[o]);
    }
}
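To make the indexing concrete, here is the inner loop pulled out into a standalone function as a sketch. The name morseIndex is made up; the weights are the ones used above: every dit adds 1, and a dah adds 64, 32, 16, 8, 4 or 2 depending on whether it is the 1st to 6th element.

```cpp
#include <cstddef>

// Compute the 'a_decode' index for a string of '.' and '-'.
std::size_t morseIndex(const char* psz) {
    std::size_t o = 0;
    for (std::size_t dah = 64; *psz; ++psz, dah >>= 1)
        o += (*psz == '.') ? 1 : dah;   // dit: +1, dah: +weight for this position
    return o;
}
```

Worked by hand for "-.-.": 64 (dah in position 1) + 1 (dit) + 16 (dah in position 3) + 1 (dit) = 82, and a_decode[82] is 'C'. Likewise ".-" gives 1 + 32 = 33, which is 'A'.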

MUM_ENTRIES

Shouldn't that be NUM_ENTRIES?

Well, it would make more sense!

EDIT: At least it was tested.

econjack:
Now that we have encoding Morse pretty well nailed down, how 'bout decoding Morse code, which is a little more interesting. Some interesting ways are described at:
<...>
The problem with decoding is that people have different "fists", or rhythms, when sending code that doesn't follow the mathematical timing that's required. The ARRL sends practice code over the air on a regular basis and it is mathematically perfect...up to 18wpm. It then starts using Farnsworth encoding, which throws the timing out the window. The best hardware/software solution we could develop using human inputs was about 30wpm, and even then it wasn't 100%, but enough to read it.

I cannot attest to other algorithms, but Magic Morse can take any of the ARRL's sample files and reproduce them 100%. I also usually have one pre-built unit at the local Stone Mountain Hamfest for folks to play around with ... I have never taken a pre-built unit back home, as someone always buys it; I price them for cost recovery only.

To test, I use a single NPN which is saturated by the output from the PC audio which I run through a small matching transformer and then through a simple RC filter to drive the transistor base. This works like a key, so the Magic Morse trainer can decode OTA, too.

Prosigns decoded:

// ITU (International Morse Code) decoding: The MM[] matrix is decoded in 6-elements to provide for prosigns
// http://upload.wikimedia.org/wikipedia/en/thumb/5/5a/Morse_comparison.svg/350px-Morse_comparison.svg.png
char MM[] PROGMEM = "_EISH5ee0TNDB6-0"    //   0 - 15      e == ERROR
                    "00ARLw0000MGZ700"    //  16 - 31      w == WAIT
                    "000UF0000i0KC000"    //  32 - 47      i == INVITE
                    "000WP000000O0800"    //  48 - 63
                    "0000Vu]00000X/00"    //  64 - 79      u == UNDERSTOOD  ] == End Of Work
                    "00000+.00000Q000"    //  80 - 95
                    "000000?00000Y()0"    //  96 - 111     () == Left/Right hand bracket
                    "0000J0000000e900"    // 112 - 127
                    "000004(c) M.R=BU"    // 128 - 143
                    "RNETTE'0000000,0"    // 144 - 159     ' @ [150] should be "
                    "00>0000000000[00"    // 160 - 175     [ == Starting Signal
                    "000000@000000000"    // 176 - 191
                    "0000030000000000"    // 192 - 207
                    "0000000000000000"    // 208 - 223
                    "0000020000000000"    // 224 - 239
                    "000001'000000000";   // 240 - 255

Dot / Dash timing shown below... a sliding-window is used to do the actual character decoding.

void setspeed(byte value)  // see:http://kf7ekb.com/morse-code-cw/morse-code-spacing/  
{
  WPM        = value;
  DITmS      = 1200 / WPM;
  DAHmS      = 3 * 1200 / WPM;
  // character break is 3 counts of quiet where dah is 3 counts of tone
  // wordSpace  = 7 * 1200 / WPM;
  wordBreak  = 7 * DITmS;    // changed from wordSpace*2/3; Key UP time in mS for WORDBREAK (space)
  Elements   = MaxElement;   // International Morse is 5 characters but ProSigns are 6 characters
  halfDIT    = DITmS/2;      // Minimum mS that Key must be UP (quiet) before MM assignment to dot/dash
  quarterDIT = DITmS/4;      // Minimum accepted value in mS for a DIT element (sloppy)
  halfDAH    = DAHmS/2;      // Maximum accepted value in mS for a DIT element (sloppy)
  DITDAH     = DITmS + DAHmS;// Maximum accepted value in mS for a DAH element (sloppy)
  DiDiDi     = DITmS * 3;    // Minimum mS that Key must be up to decode a character via MM
}
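As a worked example of those formulas, here is a standalone sketch that computes the same values and applies the "sloppy" limits the way the comments describe them. The struct, the function names and the exact comparisons are my assumptions for illustration; the real sliding-window decode is not shown in this post.

```cpp
struct MorseTiming {
    unsigned ditMs, dahMs, wordBreakMs, quarterDit, halfDah, ditDah;
};

// Same arithmetic as setspeed(), returned as a value instead of globals.
MorseTiming timingFor(unsigned wpm) {
    MorseTiming t;
    t.ditMs       = 1200 / wpm;
    t.dahMs       = 3 * 1200 / wpm;
    t.wordBreakMs = 7 * t.ditMs;
    t.quarterDit  = t.ditMs / 4;        // shortest key-down accepted as a dit
    t.halfDah     = t.dahMs / 2;        // longest key-down accepted as a dit
    t.ditDah      = t.ditMs + t.dahMs;  // longest key-down accepted as a dah
    return t;
}

// Returns '.', '-' or '?' for a key-down time in milliseconds.
char classifyElement(unsigned keyDownMs, const MorseTiming& t) {
    if (keyDownMs < t.quarterDit) return '?';   // too short, treat as noise
    if (keyDownMs <= t.halfDah)   return '.';
    if (keyDownMs <= t.ditDah)    return '-';
    return '?';                                 // too long to be a dah
}
```

At 20 WPM this gives a 60 ms dit, a 180 ms dah and a 420 ms word break. Anything from 15 ms to 90 ms is read as a dit and anything up to 240 ms as a dah, which leaves plenty of room for a human fist.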

Ray

Quick and dirty C++ utility to build 'a_decode' table for morse decoder in post #31.

#include <cstdlib>
#include <cstring>
#include <iostream>

#define NUM_ENTRIES(ARRAY)      (sizeof(ARRAY) / sizeof(ARRAY[0]))

struct encode_t
{
    const char      ch;
    const char*     psz;
};

encode_t    encode[] =
{
      { 'A' , ".-"     }
    , { 'B' , "-..."   }
    , { 'C' , "-.-."   }
    , { 'D' , "-.."    }
    , { 'E' , "."      }
    , { 'F' , "..-."   }
    , { 'G' , "--."    }
    , { 'H' , "...."   }
    , { 'I' , ".."     }
    , { 'J' , ".---"   }
    , { 'K' , "-.-"    }
    , { 'L' , ".-.."   }
    , { 'M' , "--"     }
    , { 'N' , "-."     }
    , { 'O' , "---"    }
    , { 'P' , ".--."   }
    , { 'Q' , "--.-"   }
    , { 'R' , ".-."    }
    , { 'S' , "..."    }
    , { 'T' , "-"      }
    , { 'U' , "..-"    }
    , { 'V' , "...-"   }
    , { 'W' , ".--"    }
    , { 'X' , "-..-"   }
    , { 'Y' , "-.--"   }
    , { 'Z' , "--.."   }

    , { '0' , "-----"  }
    , { '1' , ".----"  }
    , { '2' , "..---"  }
    , { '3' , "...--"  }
    , { '4' , "....-"  }
    , { '5' , "....."  }
    , { '6' , "-...."  }
    , { '7' , "--..."  }
    , { '8' , "---.."  }
    , { '9' , "----."  }

    , { '-' , "-....-" }
    , { ',' , "--..--" }
    , { ':' , "---..." }
    , { '?' , "..--.." }
    , { '.' , ".-.-.-" }
    , { '(' , "-.--."  }
    , { ')' , "-.--.-" }
    , { '/' , "-..-."  }
    , { '\'', ".----." }
    , { '\"', ".-..-." }
};

void dump(const char* const psz)
{
    using std::cout;
    using std::endl;
    using std::string;

    cout << "char a_decode[] =\n{\n";

        size_t      count   = 0;
        const char* p       = psz;
        while ( *p )
        {
            cout << ((p == psz) ? "\t  \'" : "\t, \'") << *p << ((count = ((count + 1) % 8)) ? "\'" : "\'\n");
            p++;
        }

    cout << "};" << endl;
}

int main(int const argc, const char* const argv[])
{
    using std::cout;
    using std::endl;

    size_t  num_elements = 0;

    for ( size_t n = NUM_ENTRIES(encode); n--; )
    {
        size_t  count = strlen(encode[n].psz);

        if ( num_elements < count )
        {
            num_elements    = count;
        }
    }

    cout << "oDAH = " << (1 << num_elements) << endl << endl;

	// --- allocate, fill (with '*') and zero-terminate our 'a_decode' array

    size_t  n = 0;
    for ( size_t i = 0; i < num_elements; i++, n += (1 << i) )
    {   }

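    ++n;                        // n is now the table size (largest index + 1)
    char sz[n + 1];             // variable-length array: a GCC/Clang extension

    memset(sz, '*', n);         // fill every slot with the "no match" marker
    sz[n] = 0;                  // terminate AFTER filling so dump() stops at the end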

	// --- populate the translation array 'a_decode'

    for ( size_t i = 0; i < NUM_ENTRIES(encode); i++ )
    {
        const char* psz = encode[i].psz;
        size_t      o   = 0;
        for ( size_t oDash = (1 << num_elements); *psz; oDash >>= 1 )
        {
            o += ((*psz++ == '-') ? oDash : 1);
        }

        sz[o] = encode[i].ch;
    }

	// --- dump 'a_decode' to copyable form for moving to decoder

    dump(sz);

    return EXIT_SUCCESS;
}