Help wanted recovering source file

With a bit more tweaking this is the sort of stuff I am getting. Unprintable characters are replaced by their hex contents, in brackets.

<06><80>DATA$A0
<06><85>DATA[SQ<0B><06><90>DATA"DEL
<06><95>DATA$C0
<07><00>DATA[DE<0B><07><05>DATA"FIN
<07><10>DATA$B0
<07><15>DATA[FI<0B><07> DATA"SCR
<07>%DATA$00
<07>0DATA[SC<0B><07>5DATA"FIX
<07>@DATA$B0
<07>EDATA[FX<0B><07>PDATA"LOA
<07>UDATA$00
<07>`DATA[LD<0B><07>eDATA"DUM
<07>pDATA$00
<07>uDATA[DM<0B><07><80>DATA"ASS
<07><85>DATA$00

I seem to be getting bit shifts. For example:

<06><80>DATA$A0

The bit patterns there for "<06><80>" are:

0000 0110 1000 0000

Whereas they should be newline/space (<0A><20>):

0000 1010 0010 0000

I've been able to modify my FT program to read example .WAV files of the Kansas City Standard without errors, but so far, my attempts to fiddle with the frequencies and baud rates to match your data haven't led to anything useful.

If you were in the U.S., the NSA would probably have a backup copy of your source code. You might ask them!

Very funny! I think this one would be beyond their reach. The tape has lain in a box for about the last 30 years. It wasn't in any way connected to any network. :slight_smile:

I've added more analysis:

Average low time = 448
Average high time = 230

That's been pretty consistent (in µs) - that is ±1 in each case.

So far I have the following theories:

  • The data on the tape is irretrievably damaged due to the passage of time.
  • My transcription to a disk file using Audacity was faulty, possibly due to wrong input levels or noise.
  • The tape recorder (due to its age) corrupted the data on the tape.
  • The code has a design or implementation flaw.
  • The circuit for converting analog line input to digital is faulty (schematic below)
  • The circuit needs improving, for example adding a Schmitt trigger.
  • The tape speed is not constant (however the figures posted above seem to belie this theory)
  • The parameters (eg. 0-bit length, 1-bit length) need tweaking

Op amp audio converter.png

It is possible to "hand decode" the characters, as shown below. I've done that for a few examples scattered through the file, and it is pretty clear why my program didn't work. The data appear to be encoded as 9 bits: start bit, 8 data bits, parity (odd) and stop bit, for a total of 11 bits at roughly 550 baud in these examples. The tape speed isn't quite constant though.

The record quality is pretty good -- you can even see that the generating hardware does not phase match the bit boundaries, as it should.

You should be able to decode this file. Although my first attempt to ignore the 9th bit didn't lead to anything useful, I'll fiddle with it a bit more.

The two characters top then bottom below, including the start and stop bits, are in binary
01101010101
01100110011
so, not counting start and stop, the total number of one bits (including the parity bit) seems to be odd.

Well, closer! The choice of frequencies and data rate is obviously extremely important, and tape speed variations will have to be taken into account.

Here is a snippet. Note the date!
I wonder if some of the unprintable characters are binary markers or line numbers.

 *<15><00>0* EDITOR/ASSEMBLER<05><00>@* 
<0D><00>P* 12/11/78<05><00>`* <0B><00>p* PART 1<05><00><80>* <8B><01><00>G1JMP GO
<0D><01><02>* POINTERS<90><01><05>STDATA$4000  <8E><01><10>LTDATA$4C00<8E><01><15>SYDATA$4C00<8E><01> LYDATA$6000<8C><01>%SODATA[
OP<8C><01>0LODATA[PE<8C><01>5SMDATA[MT<8C><01>@LMDATA[ME<0F><01>E* WORK AREAS<8E><01>PA1ORG $0000<09><01>UJMP G1<8B><01>`TS
RSV 02<8B><01>eTCRSV 02
B<FF>!<00><8B><01>pTERSV 02<8B><01>uTLRSV 02<8B><01><80>SSRSV 02<8B><01>
<85>SCRSV 02<8B><01><90>SERSV 02<8B><01><95>SLRSV 02<8C><02><00>T1DATA$01<8B><02><05>MSRSV 02<8B><02><10>MERSV 02<8B><02><15>MDRSV 02<8B><02> MNRSV 02<8B><02>%CHRSV 01<8B><02>0ENRSV 01<8B><02>5LWRSV 02<8C><02>@HIDATA[IE<8B><02>ESXRSV 02<8B><02>PPSRSV 01<8B><02>UOPRSV
 01<8B><02>`OMRSV 01<8B><02>eLCRSV 02<8B><02>pLSRSV 01<8B><02>uECRSV 02<8B><02><80>O1RSV 01<8B>
B<FF>"<00><02><85>O2RSV 01<8B><02><90>O3RSV 01<8B><02><95>O4RSV 07<8B><03><00>OLRSV 01<8B><03><05>CLRSV 02<8C><03><10>LADATA[IL<8B><03><15>UARSV
 01<8B><03> U2RSV 01<8B><03>%U3RSV 01<8B><03>0U4RSV 01<8B><03>5UNRSV 01<8B><03>@U5RSV 01<8B><03>ETYRSV 01<8B><03>PBIRSV 01<8B>
<03>UB2RSV 01<8B><03>`AQRSV 01<11><04><00>* INPUT/OUTPUT<0F><04><05>* WORK AREAS<04><04><10>*<0F><04><15>* INPUT LINE<8B><04> ILRSV 
30<8B><04>%IERSV 01<10><04>0* OUTPUT LINE<8A><04>5DLDATA"<8B><04>@DNRSV 04<08><04>
EDATA"<8B><04>PDRRSV 06<08><04>UDATA"<8B><04>`D1RSV 01<8B><04>eD2RSV 01<8B><04>pD3RSV 01<8B><04>uDARSV 01<08><04><80>DATA"<8B><04><85>D
ORSV 30
<0D><04><90>* MESSAGES<09><04><95>ORG A1
<8D><05><00>EMDATA"ERR<0B><05><05>DATA"OR:<08><05><10>DATA"<8B><05><15>E1RSV 02<0A><05> DATA$0D
<8D><05>%RYDATA"REA<0A><05>0DATA"DY<0A><05>5DATA$0D
<0D><06><00>* COMMANDS
<0D><06><05>* --------<15><06>
B<FF>$<00><10>* COMMAND: 3 CHARS<12><06><15>* FLAGS: 8 BITS
<14><06> * :8 BASE LINE NO
<0D><06>%* :4 RANGE<11><06>0* :2 INCREMENT<13><06>5* :1 TEXT STRING<15><06>@* :8 <<AVAILABLE>><15><06>E* :4 <<A
VAILABLE>><15><06>P* :2 <<AVAILABLE>><15><06>U* :1 <<AVAILABLE>><11><06>W* ADDRESS OF  <17><06>X* PROCESS

Here is the current code, which reads the original file recorded at 44100 sps. Many of the comments have not been corrected for the changed sample rate.

//KCS tape decode, fourier transform version.
// works perfectly with example file good-example.wav (resampled to 8 bits and 9600 bps)
// modified for Nick Gammon's tape

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <math.h>
//for Code::Blocks
// project->build options->search directories, add (lib,include)
// linker settings: link libraries, add path to libsndfil\lib\libsndfile.lib
// copy bin\libsndfile-1.dll to source directory
//  ...?? (can't get this to work) statically link by adding to linker options -static
#include <sndfile.h>

#define NPERBAUD 80
#define PI 3.1415926536

// uart variables
#define IDLE 1
#define START 2
#define RUN 4
#define RXFLAG 8
#define FRAMERR 16
#define STOP 32
#define PARITY 64

int coeffloi[NPERBAUD],coeffloq[NPERBAUD],coeffhii[NPERBAUD],coeffhiq[NPERBAUD];
void initFSK(void)
{
  int i;

  for (i=0; i<NPERBAUD; i++)
  {
    coeffloi[i] = 100*cos(2*PI*i/NPERBAUD*2200/550);
  	coeffloq[i] = 100*sin(2*PI*i/NPERBAUD*2200/550);
    coeffhii[i] = 100*cos(2*PI*i/NPERBAUD*4400/550);
	coeffhiq[i] = 100*sin(2*PI*i/NPERBAUD*4400/550);
//	printf("%2d %5d %5d %5d %5d\n",i,coeffloi[i],coeffloq[i],coeffhii[i],coeffhiq[i]);
  }
}

/*
https://sites.google.com/site/wayneholder/attiny-4-5-9-10-assembly-ide-and-programmer/bell-202-1200-baud-demodulator-in-an-attiny10
Modified for Kansas City Standard, 300 baud, 1200/2400
Sample the incoming FSK signal at 9600 samples/second (8 times the 1200 Hz frequency used in Bell 202 modulation)
Pass this through two digital filters, one tuned to 1200 Hz and the other to 2400 Hz.
The function is called for index = 0 through length(data) - 8
After stepping though at least 8 samples, the value returned from demodulate() will be >0 if it has
demodulated a 2400 Hz tone, or < 0 for a 1200 Hz tone.
*/
      int demodulate (signed char data[]) {

      int outloi = 0, outloq = 0, outhii = 0, outhiq = 0;
      int ii;
      int sample;
        for (ii = 0; ii < NPERBAUD; ii++) {
          sample = data[ii];
          outloi += sample * coeffloi[ii];
          outloq += sample * coeffloq[ii];
          outhii += sample * coeffhii[ii];
          outhiq += sample * coeffhiq[ii];
        }
        return (outhii >> 8) * (outhii >> 8) + (outhiq >> 8) * (outhiq >> 8) -
               (outloi >> 8) * (outloi >> 8) - (outloq >> 8) * (outloq >> 8);
      }
int main()
    {
// file handling

    SNDFILE *sf;
    SF_INFO info;
    int num_channels;
    int num, num_items;
    int *buf;
    int f,sr,c;
    int i,j;
    FILE *out;

// decode and uart

    static char rx,ch;
    static short status = IDLE;
    static int clock = 0;		// counter for sample
    static int bit = 0;			// bit counter
    static char parity=0;
    signed char buf2[NPERBAUD]={0};    //signal samples
    int k,spb=NPERBAUD; //samples per bit

    int result,output,count=0;

    initFSK();

    /* Open the WAV file. */
    info.format = 0;
    sf = sf_open("AE44k.wav",SFM_READ,&info);
//    sf = sf_open("CID-test.wav",SFM_READ,&info);
   if (sf == NULL)
        {
        printf("Failed to open input file.\n");
        exit(-1);
        }
    /* Print some of the info, and figure out how much data to read. */
    f = info.frames;
    sr = info.samplerate;
    c = info.channels;
    printf("frames = %d\n",f);
    printf("sample rate = %d\n",sr);
    printf("channels = %d\n",c);
    printf("format = %X\n",info.format);
    num_items = f*c;
    printf("num_items = %d\n",num_items);

    /* Allocate space for the data to be read, then read it. */
    buf = (int *) malloc(num_items*sizeof(int));

    num = sf_read_int(sf,buf,num_items);
    sf_close(sf);
    printf("Read %d items\n",num);

// Write decoded characters to filedata.txt.
   out = fopen("filedata.txt","w");
   if (out == NULL)
        {
        printf("Failed to open output file.\n");
        exit(-1);
        }

    k=0;
    for (i = 0; i < num_items - NPERBAUD; i += c)
        {
        for(k=0; k<NPERBAUD-1; k++) buf2[k]=buf2[k+1];  //shuffle to the left
        buf2[NPERBAUD-1] = (signed char) ((unsigned int)buf[i]>>24);
        result = demodulate(buf2);
        if(result>0) result=1; else result=0;

	// now we can build a uart
	// the byte is delivered in 11 bit chunks...
	// in order,
	// start bit (0)
	// bit 0
	// ...
	// bit 7
	// parity
	// stop bit (1)


	// see what our state is
	if (status & IDLE)		// we're idling
	{
		status &= ~RXFLAG;
		status &= ~PARITY;  //clear parity flag
		if (!result)				// falling edge of start bit
		{
			status &= ~IDLE;	//idle no longer
			status |= START;	//started
			bit = 0;					// reset bit counter
			clock = 0;				// and the clock count
			parity = 0;
		}
		//still waiting for end of stop bit
	}
	else
	{
		if (status & START)		//got the falling edge
		{
			if ((clock <= spb/2) && (result))	// oops, false trigger
			{
				status &= ~START;
				status |= IDLE;		// so drop back to idle mode
			}
			else
				clock++;					// otherwise, one more clock
			if (clock == spb/2)			// or are we now in mid start-bit
			{
				status &= ~START;
				status &= ~IDLE;
				status |= RUN;		// so now we're hot to trot
				clock = 0;			// reset counter
			}
		}
		else
		{
			if (status & RUN)		// we're reading data (allegedly)
			{
				if (clock < spb-1)		// time for a sample?
					clock++;				// not yet
				else
				{
					if (bit < 9)		// normal read
                                            {
						if (bit < 8) { //skip parity bit

                                rx = rx>>1;
                                if (result) {
                                rx |= 0x80;
                                parity++;  //count 1 bits
                                }
                                else rx &= 0x7f;
						}
                        clock = 0;
						bit ++;
					}
					else
					{
						if (! result)	// frame error?
						{
							status |= FRAMERR;  //if stop==0
						}
						else
						{
							status &= ~FRAMERR;
						}
						status |= IDLE;
						status |= RXFLAG;
						if (parity&1) status |= PARITY;  //1 if # of 1 bits is odd
						status &= ~RUN;
						status &= ~START;
					}
				}
			}
		}
	}
    if (status & RXFLAG) { //added for debug, sjr
            ch=rx&0x7f;  //remove parity bit
            if(ch == 0x0d) {fprintf(out,"\n"); count=0;}
            if(ch<32 || ch>126) fprintf(out,"<%02X>",(unsigned char)rx);
            else fprintf(out,"%c",ch);
           count++;
            if(count>80) {fprintf(out,"\n"); count=0;}
            } // end if (status & RXFLAG)

	output = (status<<8)+rx;
    }
    fclose(out);
	return 0;
}

I wonder if some of the unprintable characters are binary markers or line numbers.

As I mentioned earlier, the tape data comes in blocks:

  • 25 x 0xFF as filler
  • The letter 'B' (begin?)
  • One-byte binary block length, where 0xFF represents 256 bytes
  • Two bytes binary address, showing where the data came from in memory
  • The data
  • The letter 'G'

Certainly in your above post I see B<FF>!<00>, which would be 256 bytes starting at 0x2100. It is quite probable that the source started in a block of RAM at 0x2000.

12/11/78

The date sounds very plausible. I started working on that computer in 1976, and it was probably a couple of years later that I had enough RAM, UVEPROM, and peripherals to be writing an editor/assembler.

The data appear to be encoded as 9 bits: start bit, 8 data bits, parity (odd) and stop bit, for a total of 11 bits ...

Looks like I am going to have to stop believing what the manual says and look at the code they helpfully provided. Mind you, they do claim to have two stop bits, so a total of 11 bits is plausible.

I just ran your code. The first non-garbage thing was:

B<FF> <00>

So that confirms the source started at 0x2000.

I think we can eliminate the "<FF>" that are outside B...G blocks, as they are supposed to be filler.

Reading the code a bit more, they write 1024 (or 1023 maybe) of <FF> at the start of the tape. Then 25 of <FF> between each block. So those <FF>s look right.

I have a couple more theories about what might be wrong. The tape was stored for around 38 years. The magnetism might have imprinted from one layer to the next, like this:

spiral_tape_imprint.png

Or sideways, as I used one track to record one file, and the other track to record another file:

stereo_tape_imprint.png

@jremington - how do you modify the baud rate in your code? Measuring the fill-in leader I get 19.6 ms between the start bits of the 0xFF so that works out at 561 baud (1 / 0.0196 * 11).

I'm pretty sure about the parity bit, but knowing that the parity is incorrect doesn't do you much good!

The B block start certainly looks to be correct, but there is no sign of a "G" end marker.

I tried baud rate 561, and that clearly helped.

The baud rate and frequencies are set in these four lines, with the assumed baud rate being the last constant (561). Also, the number of samples per bit, NPERBAUD = sample rate / baud rate, is currently 78.

    coeffloi[i] = 100*cos(2*PI*i/NPERBAUD*2200/561);
  	coeffloq[i] = 100*sin(2*PI*i/NPERBAUD*2200/561);
    coeffhii[i] = 100*cos(2*PI*i/NPERBAUD*4400/561);
	coeffhiq[i] = 100*sin(2*PI*i/NPERBAUD*4400/561);

I've attached what I was able to decode without getting fancy. It appears that there are two copies of Editor/Assembler in the successfully decoded portion of that file. I used an editor to add two newlines at the beginning of each block.

The current code, slightly cleaned up:

//KCS tape decode, fourier transform version.
// works perfectly with example file good-example.wav (resampled to 8 bits and 9600 bps)
// modified for Nick Gammon's tape

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <math.h>
//for Code::Blocks
// project->build options->search directories, add (lib,include)
// linker settings: link libraries, add path to libsndfil\lib\libsndfile.lib
// copy bin\libsndfile-1.dll to source directory

#include <sndfile.h>

// number of samples per bit period. 44100/561 = 78 (truncated)

#define NPERBAUD 78

#define PI 3.1415926536

// uart variables
#define IDLE 1
#define START 2
#define RUN 4
#define RXFLAG 8
#define FRAMERR 16
#define STOP 32
#define PARITY 64

int coeffloi[NPERBAUD],coeffloq[NPERBAUD],coeffhii[NPERBAUD],coeffhiq[NPERBAUD];
void initFSK(void)
{
  int i;

  for (i=0; i<NPERBAUD; i++)
  {
    coeffloi[i] = 100*cos(2*PI*i/NPERBAUD*2200/561);
  	coeffloq[i] = 100*sin(2*PI*i/NPERBAUD*2200/561);
    coeffhii[i] = 100*cos(2*PI*i/NPERBAUD*4400/561);
	coeffhiq[i] = 100*sin(2*PI*i/NPERBAUD*4400/561);
//	printf("%2d %5d %5d %5d %5d\n",i,coeffloi[i],coeffloq[i],coeffhii[i],coeffhiq[i]);
  }
}

/*
Based on the Bell 202 demodulator described at
https://sites.google.com/site/wayneholder/attiny-4-5-9-10-assembly-ide-and-programmer/bell-202-1200-baud-demodulator-in-an-attiny10
Modified here for this tape: 44100 samples/second input, roughly 561 baud,
with the two digital filters tuned to about 2200 Hz and 4400 Hz (see initFSK).
The function is called once per input sample on a sliding window holding the
most recent NPERBAUD samples.
Once the window is full, the value returned from demodulate() will be > 0 if the
4400 Hz tone dominates, or < 0 if the 2200 Hz tone dominates.
*/
      int demodulate (signed char data[]) {

      int outloi = 0, outloq = 0, outhii = 0, outhiq = 0;
      int ii;
      int sample;
        for (ii = 0; ii < NPERBAUD; ii++) {
          sample = data[ii];
          outloi += sample * coeffloi[ii];
          outloq += sample * coeffloq[ii];
          outhii += sample * coeffhii[ii];
          outhiq += sample * coeffhiq[ii];
        }
        return (outhii >> 8) * (outhii >> 8) + (outhiq >> 8) * (outhiq >> 8) -
               (outloi >> 8) * (outloi >> 8) - (outloq >> 8) * (outloq >> 8);
      }
int main()
    {
// file handling

    SNDFILE *sf;
    SF_INFO info;
    int num_channels;
    int num, num_items;
    int *buf;
    int f,sr,c;
    int i,j;
    FILE *out;

// decode and uart

    static char rx,ch;
    static short status = IDLE;
    static int clock = 0;		// counter for sample
    static int bit = 0;			// bit counter
    static char parity=0;
    signed char buf2[NPERBAUD]={0};    //signal samples
    int k,spb=NPERBAUD; //samples per bit

    int result,output,count=0;

    initFSK();

    /* Open the WAV file. */
    info.format = 0;
    sf = sf_open("AE44k.wav",SFM_READ,&info);
//    sf = sf_open("CID-test.wav",SFM_READ,&info);
   if (sf == NULL)
        {
        printf("Failed to open input file.\n");
        exit(-1);
        }
    /* Print some of the info, and figure out how much data to read. */
    f = info.frames;
    sr = info.samplerate;
    c = info.channels;
    printf("frames = %d\n",f);
    printf("sample rate = %d\n",sr);
    printf("channels = %d\n",c);
    printf("format = %X\n",info.format);
    num_items = f*c;
    printf("num_items = %d\n",num_items);

    /* Allocate space for the data to be read, then read it. */
    buf = (int *) malloc(num_items*sizeof(int));

    num = sf_read_int(sf,buf,num_items);
    sf_close(sf);
    printf("Read %d items\n",num);

// Write decoded characters to filedata.txt.
   out = fopen("filedata.txt","w");
   if (out == NULL)
        {
        printf("Failed to open output file.\n");
        exit(-1);
        }

    k=0;
    for (i = 0; i < num_items - NPERBAUD; i += c)
        {
        for(k=0; k<NPERBAUD-1; k++) buf2[k]=buf2[k+1];  //shuffle to the left
        buf2[NPERBAUD-1] = (signed char) ((unsigned int)buf[i]>>24);
        result = demodulate(buf2);
        if(result>0) result=1; else result=0;

	// now we can build a uart
	// the byte is delivered in 11 bit chunks...
	// in order,
	// start bit (0)
	// bit 0
	// ...
	// bit 7
	// parity
	// stop bit (1)


	// see what our state is
	if (status & IDLE)		// we're idling
	{
		status &= ~RXFLAG;
		status &= ~PARITY;  //clear parity flag
		if (!result)				// falling edge of start bit
		{
			status &= ~IDLE;	// so we idle no longer
			status |= START;	// we're started
			bit = 0;					// reset bit counter
			clock = 0;				// and the clock count
			parity = 0;
		}
		// else we're still waiting for end of stop bit
	}
	else
	{
		if (status & START)		// aha, we got the falling edge
		{
			if ((clock <= spb/2) && (result))	// oops, false trigger...noise perhaps
			{
				status &= ~START;
				status |= IDLE;		// so drop back to idle mode
			}
			else
				clock++;					// otherwise, one more clock
			if (clock == spb/2)			// or are we now in mid start-bit
			{
				status &= ~START;
				status &= ~IDLE;
				status |= RUN;		// so now we're hot to trot
				clock = 0;			// reset counter
			}
		}
		else
		{
			if (status & RUN)		// we're reading data (allegedly)
			{
				if (clock < spb-1)		// time for a sample?
					clock++;				// not yet
				else
				{
					if (bit < 9)		// normal read
                                            {
						if (bit < 8) { //skip parity bit

                                rx = rx>>1;
                                if (result) {
                                rx |= 0x80;
                                parity++;  //count 1 bits
                                }
                                else rx &= 0x7f;
						}
                        clock = 0;
						bit ++;
					}
					else
					{
						if (! result)	// frame error?
						{
							status |= FRAMERR;  //if stop==0
						}
						else
						{
							status &= ~FRAMERR;
						}
						status |= IDLE;
						status |= RXFLAG;
						if (parity&1) status |= PARITY;  //1 if # of 1 bits is odd
						status &= ~RUN;
						status &= ~START;
					}
				}
			}
		}
	}
    if (status & RXFLAG) { //added for debug, sjr
            ch=rx&0x7f;

            if(ch<32 || ch>126) {
                    fprintf(out,"<%02X>",(unsigned char)rx);
                    count += 4;
            }
            else {
                    fprintf(out,"%c",ch);
                    count++;
            }
            if(count > 79) {fprintf(out,"\n"); count=0;}  //shorten long lines in output
            } // end if (status & RXFLAG)

    }  //end for i to num_items

    fclose(out);

	return 0;
}

filedata_nl.txt (24 KB)

@jremington - I did a bit of experimenting with the file I made with your code. Between two batches of <FF> (and disregarding the initial "B<FF>,<00>") I found:

G6.#.SUBA$07.#.ORAB$02.#.G7ANDB$FB.$.G8ANDB$FE.$.PSHB.$.LDABUA.$.CMPB"0.$.BEQ G9.$.ORAB$01.$.PULB.$"BRA G6.$%G9LDABU2.$(STABUA.$1LDABU3.$4STABU2.$7LDABU4.$@STABU3.$CSTAAU4.$FPULB.$IINX.$RBRA G5.$UG6LDAAUA.$XASLA.$aASLA.$dASLA.$gASLA.$pSTAAUN.$sLDAAU2.$vAND

I converted the hex codes back to a period, and then counted the results. 256 bytes!

However it seems to be missing the "G" at the end.

Putting the hex codes back in:

G6<0A>#<93>SUBA$07<0A>#<96>ORAB$02<8C>#<99>G7ANDB$FB<8C>$<02>G8ANDB
$FE<07>$<05>PSHB<09>$<08>LDABUA<09>$<11>CMPB"0<09>$<13>BEQ G9<0A>$<16>ORAB$01<07>$<19>PULB<09>$"BRA G6<8B>$%G9LDABU2<09>$(STAB
UA<09>$1LDABU3<09>$4STABU2<09>$7LDABU4<09>$@STABU3<09>$CSTAAU4<07>$FPULB<06>$IINX<09>$RBRA G5<8B>$UG6LDAAUA<07>
$XASLA<07>$aASLA<07>$dASLA<07>$gASLA<09>$pSTAAUN<09>$sLDAAU2<0A>$vAND

Now adding in spaces and newlines where they seem logical we get:

G6<0A>
#<93>SUB A $07<0A>
#<96>ORA B $02<8C>
#<99>

G7  AND B $FB<8C>$<02>

G8  AND B $FE<07>$<05>
    PSH B <09>$<08>
    LDA B UA<09>$<11>
    CMP B "0<09>$<13>
    BEQ G9 <0A>$<16>
    
    ORA B $01<07>$<19>
    PUL B <09>$"
    BRA G6<8B>$%
    
G9  LDA B U2<09>$(
    STA B UA<09>$1
    LDA B U3<09>$4
    STA B U2<09>$7
    LDA B U4<09>$@
    STA B U3<09>$C
    STA A U4<07>$F
    PUL B <06>$I
    INX <09>$R
    BRA G5<8B>$U
    
G6  LDA A UA<07>$X
    ASL A<07>$a
    ASL A<07>$d
    ASL A<07>$g
    ASL A<09>$p
    STA A UN<09>$s
    LDA A U2<0A>$v
    AND

If we assume that <0A> is a newline and <09> is a tab, then we end up with fairly readable source:

G6 
#<93>SUB A $07 
#<96>ORA B $02<8C>
#<99>

G7  AND B $FB<8C>$<02>

G8  AND B $FE<07>$<05>
    PSH B  $<08>
    LDA B UA $<11>
    CMP B "0 $<13>
    BEQ G9  $<16>
    
    ORA B $01<07>$<19>
    PUL B  $"
    BRA G6<8B>$%
    
G9  LDA B U2 $(
    STA B UA $1
    LDA B U3 $4
    STA B U2 $7
    LDA B U4 $@
    STA B U3 $C
    STA A U4<07>$F
    PUL B <06>$I
    INX  $R
    BRA G5<8B>$U
    
G6  LDA A UA<07>$X
    ASL A<07>$a
    ASL A<07>$d
    ASL A<07>$g
    ASL A $p
    STA A UN $s
    LDA A U2 $v
    AND

I'm guessing that the $xx at the end of lines is a comment.

Finally I understand what I am looking at! This was written in the days before I knew about C, and its "newline indicates the end of line" stuff. In those days we used Pascal line conventions. A line consisted of:

VLI = variable-line indicator

So for example:

<09>&SLDAATY<07>&VTAP <07>&YPULA

First line: 9 bytes, second line: 7 bytes, third line: 7 bytes, and so on. (It looks like the VLI byte itself is included in the count.)

Knowing that we should be able to get rid of most of the unprintable characters, and add in the line breaks.

That is looking pretty good!

I take it that the 1978 source code isn't all flooding back to you, in a burst of admiration?

Not "flooding" but getting there. :slight_smile:

jremington:
The B block start certainly looks to be correct, but there is no sign of a "G" end marker.

I re-read the source code for the output routine. Despite what the manual said, it actually only sends 'G' once, after the final lot of 0xFF. So the 'G' is the end of file marker, not the end of block marker.

So, for example, in the file you uploaded, after the first copy of the assembler:

RAB$02<8C>#<99>G7ANDB$FB<8C>$<02>G8ANDB$FE<07>$<05>PSHB<09>$<08>LDABUA<08>$<11>C
MB"0<00><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF>
<FF><FF><FF><FF><FF><FF><FF>G

There's the "G"!

Nick:

I looked up the MC6850 ACIA chip data sheet, and it has several modes of operation that are set by chip inputs.

As is usual nowadays, serial data can be 7 or 8 bits, 1 or 2 stop bits, and options include 8 bits plus parity (odd or even) with 1 stop bit. So, the manual text might be in error for later versions of the program.

My decoder now counts various errors, and about 1/3 (3655/11041) of the data bytes are reported to have parity errors. I ignored those errors but that gives a rough indication of the frequency of single bit errors. I suspect they are primarily due to recorder speed variations, but your other theories are certainly viable.

Still the data recovery rate is pretty remarkable for the age of the tape. By chance was the tape sitting in the same orientation with respect to the Earth's magnetic field for a few decades?

I won't be able to get to the code for a few hours, but I would expect about half to be wrong parity by random chance, given that it is (possibly) just a stop bit. And then given that there are a lot of 0xFF which would be even on its own, and need a 1-bit for odd parity, that means some extra ones would look "right" anyway. So that could explain your 1/3 that seem to be incorrect parity.

By chance was the tape sitting in the same orientation with respect to the Earth's magnetic field for a few decades?

It would have been sitting in a box for ages, not moved much. I think what I need to do is suppress the printing of newlines (because they aren't "real" - but would just represent a line of length 10) and just output straight hex. Then a suitable program can read that in and do the block decoding, followed by the VLI stuff to break it into lines. We might get quite a good result at this rate.

Looking at some of the source, the lack of spaces seems a bit surprising, e.g.

LDAAUN

But I suppose if you are desperate for RAM, they aren't really needed. That could be interpreted as:

LDA A UN

without too much trouble (where UN is a two-character identifier).

I'm a little surprised that the labels seem to be hard against the code but maybe that will clear itself up when the data is cleaned up into lines.

For example:

B5RTS

I presume that is:

B5 RTS

Especially as there is this before it:

BNE B5

Some of the VLIs seem to have the high-order bit set. I don't know if that is just a read error, or if that was some sort of flag (eg. new page).