Microsoft "20HAL"
Last Updated 02/12/2024 19:44 -0500
This topic isn't directly related to my work with the Seattle Gazelle, but it is an important, although minor, part of the overall development process of PC-DOS, and somewhat relevant after having hit the 40th anniversary of its introduction in 1981.
If you visit Dan, TheStarman's site at pcministry, there is a discussion about a June 1981 pre-release version of PC DOS 1.0 that is sometimes arbitrarily called "IBM PC DOS 0.90". This discussion is related to a disk image Dan received in 2006 from a former IBM employee which contained a near-final version of PC-DOS and some development-related tools that never made it to the final disk shipped by IBM. One interesting tool is "20HAL.COM" dated April 23, 1981 (or earlier), about four months before the introduction of the IBM-PC on August 12, 1981. The "20" might refer to the DECSYSTEM-2020 that Microsoft supposedly had at the time, although I haven't verified that. "HAL" is also an interesting choice, with each letter being one off from IBM -- and/or an interesting parallel to a large, maniacal, all-knowing spaceship computer of the same name.
20HAL is a relatively short program (1.8k) that I believe was used to transfer files between Microsoft in Bellevue, WA and IBM's Entry Systems division in Boca Raton, FL. To my knowledge, no one has examined it closely because, really, how important is it? It was a simple internal tool to get DOS development files to IBM. But, it's an interesting digital archeological project.
Dan, TheStarman's site also references a 2005 interview with Bob O'Rear (https://thestarman.pcministry.com/DOS/ibm100/Exam.htm#DATA) that confirms that IBM did dial into Microsoft's DECSYSTEM-20 (using a 300-baud voice modem) to communicate about development issues. It doesn't specifically mention file transfers, and I assume the reference to "voice modem" means an acoustic coupler. That page also discusses information found in the slack space (unused space on a diskette) on an original PC-DOS 1.10 disk. This space contains references to what might be later versions of this same program ("DEC-20 +++ FAST +++ [Hal version] 11-Oct-81" and, four months later, "Tops-20 Downlink to MS-DOS (Created: 24-Feb-82) [IBM Version] [1200-bps]").
As an aside, I think it's interesting that there's anything in the slack space of a public distribution -- that means that someone made a duplication master out of a previously-used work diskette. A freshly-formatted disk is mostly filled with 0xF6 in the slack space. Admittedly, sector-level disk editing tools didn't exist at first, so there was no way for the casual user to view slack space. I guess you could have loaded it into memory using DEBUG, but that's not nearly as user-friendly as Norton Disk Edit. Anyway, I digress...
I have so far not found anyone who knows exactly how the inter-company communication was set up or how the utility was actually used. There is no embedded help, which is unsurprising since it was an internal tool. So, necessarily, this requires a lot of conjecture based on what would seem logical in that environment in 1980 and 1981. I'll summarize the results here, but you can scroll down for more of the day-to-day analysis. I had several questions:
How was the utility developed and on what system?
What was the communications hardware "setup"? How did it interface with the various machines and how was the connection made?
How was it invoked and used on a day-to-day basis?
A friend of mine, Ryan Ottignon, is working on a history project relating to SCP and the different versions of 86-DOS that were built. He told me of a passage in the book Gates (Stephen Manes and Paul Andrews) which, in Chapter 12 ("DOS Capital"), mentions exchanging development files with IBM:
"Software exchanges took place daily. Every afternoon around five o’clock, someone from Microsoft would round up the disks
and drive down the freeway to Sea-Tac Airport and the Delta DASH package delivery service. Eventually modems were set up
to speed transfers even further."
This passage seems to answer part of the second question above. This chapter goes on to say that the Chess prototype was unreliable (quite possibly due to overheating in the unairconditioned "secure" lab, and the wire wrapping construction), but that DOS and BASIC were running on it by late January 1981 (the 22nd to the 25th, specifically). Further, it mentions heavy troubleshooting sessions in March, and the need to have IBM actively engaged in the testing to keep the project on schedule. From a code development perspective, the book describes Bob O'Rear receiving updated code from Tim Paterson @ SCP and doing the adaptation work for Chess in a "...wildly kludgy multistep process that involved a stroll from one end of the building to the other and the use of three different machines, not counting the IBM prototype." In an email exchange I had with Tim Paterson in 2014, he told me that Microsoft stored the source on a PDP-10 but the code was actually assembled on a specially-configured Seattle Computer Products ("SCP") S-100 machine (I believe with an above-average amount of memory). If Bob was doing the adaptation work, he would be a likely source of the files that needed to be sent to IBM daily. It wouldn't be too much of a leap to think he developed the 20HAL tool, if for nothing else than to save himself a ton of time moving files around when he could spend that time writing/debugging code.
In Ryan's research, some of the versions of 86-DOS code he uncovered were stored on Northstar-format (89.6k, 35x10x256, hard-sectored) diskettes. It is not clear to me whether Northstar (Z80) machines were used at SCP for some purpose, or if it was a personal machine owned by that developer (Pat Opalka). This same image contained a clone of the MicroPro Word Master text editor. According to a post on the VCFED forum, it was a decompilation-conversion-recompilation of Word Master for CP/M that was used by Tim Paterson as his primary IDE for later programs.
2024 Updates
Recently a trove of old SCP disks (literally hundreds) has been found by a friend of mine, f15sim. Among these disks was the earliest known version of 86-DOS, v0.1C - SN#11. This launched a massive effort to compare it to the later known version 0.34 and the released 1.0 version, which was parallel to PC-DOS. While those efforts are too detailed to describe here, one document came to light within files uploaded by Bob O'Rear to the Internet Archive (here). This undated document (although likely not earlier than February 1981) details how the object code came together and was sent to IBM. It also confirms several aspects of the process which I surmised below. Here's what Bob said (with my editorials in brackets):
Seattle Computers delivers source of QDOS and utilities on 8" QDOS formatted diskette and an absolute assembler. {Depending on the date, this could be the SMALDISK or LARGDISK format, the difference being a 16-byte directory versus 32-byte which was added in February 1981.}
Incorporate changes to QDOS using EDLIN + assemble into absolute code. {"Absolute code" would likely be a COM file. Using SCP utilities and syntax, this probably means something like "EDLIN 86DOS.ASM", "ASM 86DOS", and "HEX2BIN 86DOS".}
Run an encode program (written in BASIC) that converts absolute code to Intel ASCII hex format. {This is key because file transfers were not "8-bit clean" because of 7-bit terminal ASCII. This produces an ASCII-only file like UUENCODE (which didn't come out until 1983).}
Upload the encoded ASCII hex QDOS to the DEC 2020. {This confirms the system used.}
BIOS for MSDOS written on 2020 in XMACRO-86 a cross assembler. Assemble to absolute location. {I think this refers to DOSIO.ASM, the IO layer that 86DOS calls.}
Download BIOS and ASCII-hex encoded QDOS to Intel ISIS system. {This is new information.}
Encode the BIOS to Intel ASCII-hex on the Intel ISIS machine. {Needed as a bridge between the DEC2020 to get both into HEX??}
Transfer the BIOS + QDOS over IBM RS232 to IBM PC prototype memory via ROM debugger that accepts Intel ASCII hex. {Was the Intel MDS the only one with a serial port?? Doubt it...so was there another machine involved here? The Gazelle definitely had serial ports.}
Use IBM PC prototype debugger to write BIOS + QDOS to exact sector locations of PC prototype 5-1/4 diskette system. {Many system ROM monitors I've used on S100 systems and others have the ability to receive HEX files. Really need a copy of that prototype ROM!!}
Repeat assembly, encoding, etc for QDOS utilities. {Rinse and repeat...}
Whew! Some interesting takeaways:
The process is similar to building a new CP/M system today in that the CBIOS (DOSIO) is assembled with a cross-assembler separately from BDOS (86DOS) and CCP (COMMAND.COM), merged as Intel HEX files, uploaded by the system using a ROM monitor and then written to disk. Look at the section "Implementing a new Disk System" on my IMSAI page.
It confirms that at least initially Microsoft was using ASCII-encoded (7-bit) transfers rather than object code sent using XMODEM or other protocol.
It confirms that Microsoft used a DECSystem 2020 for development
The use of an Intel ISIS system is new information. I'm trying to locate a manual for the DECSystem-hosted XMACRO program to see what the output was. There is a version that was CP/M and ISIS-II hosted which operates like a normal assembler and outputs relocatable object code. The OS/2 Museum has a page on this here which says that the ISIS-II system could directly load and execute OMF (Object Module Format) files popularized by Microsoft. OK, that makes sense, but that's a complex way to translate OMF to HEX.
The page on OS/2 Museum also notes that the system BIOS was likely produced on the ISIS-II using ASM86. I was unsure about the comment about Microsoft not having the right 8086 tools at the time due to the MACRO-86 reference I found...but according to this page, MACRO-86 was released in 1981 as a DOS program. That same page also talks generally about DOS development in the context of developing MASM (which notably was written in Pascal) "...Microsoft had been doing all their development on DEC computers...". In the version history table, this version is missing...but...it exists on the distribution disks of MS-PASCAL 2 (disk 2, 12/23/1981) available on bitsavers or winworldpc.
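Since the Intel ASCII-hex step is central to Bob's process, here is a minimal sketch of what such an encoder does, written in Python rather than the BASIC he mentions. It follows the standard Intel HEX record layout (byte count, 16-bit address, record type, data, two's-complement checksum). The function names, the 16-byte record length, and the 0100h load address are my assumptions for illustration, not details taken from Bob's document.

```python
def intel_hex_record(address, data, record_type=0x00):
    """Build one Intel HEX record: ':' + count + address + type + data + checksum."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + bytes(data)
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def encode_intel_hex(image, load_address=0x0100, bytes_per_record=16):
    """Encode a binary image as a list of Intel HEX text lines.

    load_address=0x0100 is an assumption (typical for a COM-style image).
    """
    lines = []
    for offset in range(0, len(image), bytes_per_record):
        chunk = image[offset:offset + bytes_per_record]
        lines.append(intel_hex_record(load_address + offset, chunk))
    lines.append(intel_hex_record(0, b"", record_type=0x01))  # end-of-file record
    return lines


if __name__ == "__main__":
    # Four bytes borrowed from the 20HAL listing (mov cl,6 / mov dl,al).
    for line in encode_intel_hex(bytes([0xB1, 0x06, 0x8A, 0xD0])):
        print(line)
```

Every character in the output is printable 7-bit ASCII, which is the property that matters for a link that is not "8-bit clean".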
Breaking down the Code
To start, I ran the code through Sourcer to get a listing file, and I made extensive use of the Scroll Symbolic Tracer by Murray Sargent (itself a program with an interesting Microsoft-connected history). Looking at the embedded text strings revealed TOPS-10 commands and keyboard sequences to print login information, set control code filtering, execute a copy command, and run external programs. The original listing is here.
The code had an odd mix of CP/M style calls (using "CALL 0005") for keyboard input and certain disk functions, calls to the PC BIOS for video and serial communications, and finally two isolated PC DOS INT21H calls for two disk functions. The hallmarks of CP/M, the jumping around in the code, and the stack jumps (pushing an address on the stack and executing a return without balancing the stack) just don't seem to fit with what I read about IBM's strait-laced development practices. So, I feel that it was developed by either SCP or Microsoft and given to IBM to use.
The CP/M-style calls and the names of the two transfer programs on the PDP-10 (both start with "CPM") would hint at it being originally written on a CP/M 2.2 system and then possibly translated to 8086 using SCP's TRANS86. 86-DOS (and subsequently, MS-DOS) was designed to accept CP/M-compatible system calls as a way to enable users to migrate CP/M programs (Z80-based) to 8086-based platforms. CP/M-86 wasn't released until three months after the PC so I doubt it was developed using that. SCP did not have a Z80 CPU card, but Gazelle-specific versions of both 86-DOS and MS-DOS could use floppy controller cards that were common in Z80 S-100 systems of the era. If the recovered Northstar disks were from a development machine, then the pieces somewhat fit together.
The next question relates to how the connection between Microsoft and IBM was made. When looking at the serial communications code, there are no phone numbers or "smart modem" command strings. The Hayes Smartmodem 300 didn't come out until later in 1981; there were S-100 modems, and although none were "smart", they could take a "dial" string. The least expensive method would probably have been acoustic couplers on either end of a POTS line, rather than a leased line...but...the core PC-DOS source files were about 300K which, at 300 baud, would take over 2 hours to transfer. This could have been fine at the end of the day when toll rates would have been lower. Using a 56kbps leased line, that transfer would take only about 45 seconds. The use of modems would appear to be confirmed by the passage in the Gates book, although the type of modem was non-specific.
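The transfer-time estimates above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes "300K" means 300 x 1024 bytes, that an asynchronous modem link spends 10 bits per byte (start bit, 8 data bits, stop bit), and that there is no protocol overhead or retransmission. The function name is mine.

```python
def transfer_seconds(size_bytes, line_bps, bits_per_byte=10):
    """Seconds needed to move size_bytes over a line running at line_bps."""
    return size_bytes * bits_per_byte / line_bps


SOURCE_SIZE = 300 * 1024  # assumed size of the core PC-DOS source files

# 300 baud with async framing (10 bits per byte)
hours_at_300 = transfer_seconds(SOURCE_SIZE, 300) / 3600
print(f"300 baud: {hours_at_300:.2f} hours")

# 56 kbps leased line, counting only the 8 data bits per byte
secs_at_56k = transfer_seconds(SOURCE_SIZE, 56_000, bits_per_byte=8)
print(f"56 kbps:  {secs_at_56k:.1f} seconds")
```

Under these assumptions the 300-baud figure works out to a bit under three hours, and the leased-line figure to roughly 44 seconds (closer to 55 seconds if 10 bits per byte are counted there as well).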
As you get into the code, there are some small clues as to the direction of data and thus, which company likely used it. The utility was provided by a former IBM employee, the code contains the word "downlink", it opens files locally with an "overwrite" attribute, and it acts as a terminal emulator to send remote system commands. To me, these all point to the utility being used by the "downloader", that is, the IBM development team, in order to grab files directly from Microsoft's minicomputer.
20HAL operates as follows:
The code is configured for 300 baud, but with the change of a single byte, it will work at 9600 baud. 20HAL relies on two programs residing on the DECSYSTEM in order to do the actual file transfer. Code for those programs is not known to exist but may be similar to ones from ITS or CompuServe (which ran on the PDP-10). There doesn't appear to be much in the way of blocking or error correction.
20HAL can be invoked with the name of a text file on the command line. Each line of the file is interpreted as a download command, so this is a convenient way to bulk download files.
When initializing, 20HAL sends two command strings to the remote system. The first is "^C^CPJ". ^C (ETX) breaks into the TOPS-10 monitor (basically a wake-up command to get a prompt) and PJ is short for "PJOB", which prints the terminal login banner. The second string is "SET TTY FILL 3" which controls what happens after a control character is received. It's like a kind of character pacing (delay) setting. 20HAL then prints a string of 20 blocks and waits.
At this point, 20HAL can be used as a terminal emulator, although there's no local echo. If you hit ^E (ENQ), you are presented with a star-prompt. You can hit ^U at any time to cancel the command and start at a fresh prompt.
The transfer command format is [drive:]destfile.ext=remotef.ext-{a/b/r}. Some notes:
filenames are automatically space-padded to 8/3 for use by DOS file control blocks.
drive is optional but all other parameters are required.
only three file types are allowed: MAC (Macro Assembler source file), COM (executable file), and REL (a relocatable executable file). The final parameter {a/b/r} selects the file type and is used to send the correct command string to the PDP-10.
There are hints of a full-disk transfer capability, but I haven't been able to get that to work.
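To make the command format above concrete, here is a rough Python sketch of a parser for it. This is not a translation of 20HAL's actual parsing code; the regular expression, the function name, the upper-casing of names, and the example filenames are all my assumptions. It only illustrates the shape of the command, the 8/3 space padding for the FCB, and the a/b/r selector.

```python
import re

# Transfer selector -> file type, per the notes above.
TRANSFER_TYPES = {"A": "MAC", "B": "COM", "R": "REL"}

# [drive:]destfile.ext=remotef.ext-{a/b/r}
COMMAND_RE = re.compile(
    r"^(?:(?P<drive>[A-Za-z]):)?"
    r"(?P<dest_name>[^.=:\s]{1,8})\.(?P<dest_ext>[^.=\-\s]{1,3})"
    r"="
    r"(?P<remote>[^.=\s]{1,8}\.[^.=\-\s]{1,3})"
    r"-(?P<flag>[AaBbRr])$"
)


def parse_transfer_command(line):
    """Split a transfer command into its parts; raise ValueError if malformed."""
    match = COMMAND_RE.match(line.strip())
    if match is None:
        raise ValueError("Command Error")
    flag = match.group("flag").upper()
    return {
        "drive": (match.group("drive") or "").upper(),       # optional
        "fcb_name": match.group("dest_name").upper().ljust(8),  # padded to 8
        "fcb_ext": match.group("dest_ext").upper().ljust(3),    # padded to 3
        "remote": match.group("remote").upper(),
        "file_type": TRANSFER_TYPES[flag],
    }


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    print(parse_transfer_command("b:io.mac=ibmbio.mac-a"))
```

A line missing the trailing selector, such as "io.mac=ibmbio.mac", raises ValueError here, mirroring the "Command Error" response 20HAL gives to commands it cannot parse.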
Once the command is fully parsed, 20HAL does three things:
creates destfile.ext on the local system (checking for disk full; automatic overwrite)
sends a remote command to the DECSYSTEM in order to copy the requested file to the home directory of the logged in user (COP DISK=remotef.ext). It's not known how these files were stored on the DECSYSTEM.
sends a second remote command to the DECSYSTEM to initiate the file transfer. Two different programs were used, one for plain-text (CPMTXT) and one for binary (CPMLOD). Information is sent in 128-byte blocks which matches the default CP/M SSSD record size, but there doesn't appear to be any error correction. I immediately thought XMODEM (based on the time period), but it isn't. It seems more like XON/XOFF but with possibly different codes.
When completed, control is returned to terminal mode unless running a script, in which case it loops through the script until done.
What I have not yet been able to figure out is the file transfer protocol. I have been able to successfully transfer a single block of data, but it relies on either pauses or some handshaking -- something that can't be accomplished purely through HyperTerm. I did try to write a short program in QBASIC to act as the server, but that didn't work well.
I did set up a TOPS-10 system on SIMH, and I have several acoustic modems and a PBX, so at some point I can get a better demo working.
Daily Log
Below is more of a day-to-day log of what I discovered, but I did not go back and correct earlier entries as I learned more. I would really like to find someone who has used this program, but that's obviously a very limited number of people, most of whom are probably now retired. I would also like to get the PDP-10 file transfer code to see how that worked.
7/17/2021
After a few months of having this project on the shelf, I embarked on taking the source code and seeing if I could produce binary-identical output. I grabbed a copy of IBM Macro Assembler 1.0 (probably what was MASM 1.0), figuring that it would be the closest thing to what was used at the time. After a day or two of noodling around with the syntax, I still could only get the code about 90% there. Then, it dawned on me that, more than likely, they would have just used the SCP Assembler from 86-DOS since it produces a fairly clean non-segmented binary. Duh! Time to fire-up an MS-DOS VM and see if it works.
The SCP syntax is different from MASM (sort of like NASM versus MASM), so I needed to spend time converting the Sourcer-produced file to be compatible with ASM. After those changes, the first pass through ASM resulted in about 50 errors, mainly because of stuff I missed. After fixing those errors, I was able to get a clean compile with no errors. Yay! Most of the problems relate to forcing a word-wide op code rather than the byte-wide version, which were easily fixed, and a few other quirks like colons after labels and invalid characters (like "_"). The one remaining problem relates to operand size in a single line -- it compiles but produces the incorrect byte sequence. For example, code just after the loc44 label is:
mov [bx+1],ch
In the original program, this codes as 88/AF/01/00 yet when recompiled, it comes out as 88/6F/01. Even if I use the "W" (word) modifier, it doesn't change the output. Arrrgh. One of my friends from the VCFED forum mentioned a feature of the SCP assembler in which you can force a 16-bit reference by using a forward equate that's not "near" (so outside -128 to +127 from the PC). So, the above would be...
mov [bx+ONE],ch
...and then at the bottom of the source file I added:
ONE: equ 1
That fixed it! Not sure if that's how it would have actually been coded, but at least it causes ASM to emit the right bytes. There is one additional similar mis-coding in the FlushBuf routine (cmp cl,0) which requires using the same forward equate trick (cmp cl,ZERO) to get the right bytes (80/F9/00 rather than 82/F9/00).
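The two byte sequences differ only in the "mod" field of the 8086 ModRM byte, which selects an 8-bit or a 16-bit displacement. Here is a small Python sketch that builds both encodings of this one instruction form. The function and table names are mine, and it deliberately handles only "mov [bx+disp], reg8" (opcode 88h); it always emits a displacement, so it never produces the shorter no-displacement form.

```python
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}
RM_BX = 0b111  # r/m code for [bx + displacement] in 16-bit addressing


def encode_mov_bx_disp_reg8(disp, reg, force_disp16=False):
    """Encode `mov [bx+disp], reg8` as opcode 88h + ModRM + displacement."""
    if -128 <= disp <= 127 and not force_disp16:
        mod = 0b01                                   # 8-bit displacement follows
        disp_bytes = (disp & 0xFF).to_bytes(1, "little")
    else:
        mod = 0b10                                   # 16-bit displacement follows
        disp_bytes = (disp & 0xFFFF).to_bytes(2, "little")
    modrm = (mod << 6) | (REG8[reg] << 3) | RM_BX
    return bytes([0x88, modrm]) + disp_bytes


if __name__ == "__main__":
    print(encode_mov_bx_disp_reg8(1, "ch").hex("/"))                     # short form
    print(encode_mov_bx_disp_reg8(1, "ch", force_disp16=True).hex("/"))  # long form
```

The short form comes out as 88/6f/01 and the forced 16-bit form as 88/af/01/00, matching the recompiled and original bytes respectively. Both execute identically; the long form is just one byte larger.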
Here is the source and listing file for using SCP ASM.
4/2/2021
I decided to continue experimenting with this using acoustic coupler modems and several other pieces to simulate a dial-up setup that might have been used. As of now I only have a "remote console" working under SIMH, but it's a start. The basic setup is as follows:
SERIAL TERMINAL <--> MODEM 1 [:::] //~~~ TELCO (Panasonic PBX and two DM500 telephones) ~~~// [:::] MODEM 2 <----> LSI ADM 31 TERMINAL <----> HP LAPTOP { SIMH, TOPS10 }
Modem 1 is an Anderson Jacobson A242-A 300-baud originate-only acoustic coupler modem. Modem 2 is a Nixdorf-branded Novation Cat 300-baud originate/answer acoustic coupler modem. The LSI terminal is a Lear-Siegler ADM31 terminal with a terminal pass-through connector. This is important because Modem 2 is connected to the "Extension" connector and "Modem" is connected to the computer running SIMH.
I have not been able to get SIMH to properly run the DZ terminal multiplexer with real serial ports at 300 baud using "ATTACH DZ -am LINE=0,CONNECT=SER2;300-8N1" (to put a real serial port on the DZ). So, I resorted to configuring the OPR terminal to use a real port using the SET CONSOLE command in SIMH:
simh>SET CONSOLE SERIAL=SER2;300-8N1
With this, I can type commands on the serial terminal and interact with the simulated PDP-10 over the modem connection. So, that's a good start.
The Gory Details
I would note that for system testing, I used a simulated PDP-10 running TOPS-10 using a pre-built TOPS-10 image.
There are certain code sequences which look very much like CP/M 2.2 BDOS system calls. An example:
9663:0584 print_char:
9663:0584 53 push bx
9663:0585 52 push dx
9663:0586 51 push cx
9663:0587 50 push ax
9663:0588 9C pushf
9663:0589 B1 06 mov cl,6
9663:058B 8A D0 mov dl,al
9663:058D E8 FA75 call cs:5 ;CALL 0005 $-588h
9663:0590 9D popf
9663:0591 58 pop ax
9663:0592 59 pop cx
9663:0593 5A pop dx
9663:0594 5B pop bx
9663:0595 C3 retn
When looking at the disassembly, the call at offset 058Dh came out as a negative offset to the PC. When doing the math, it results in "5". That was the clue. CP/M system calls used the C register as the system call number and then called BDOS (CP/M's equivalent of the "DOS" part of PC DOS) through a call to a fixed address in what Digital Research referred to as "Low Storage" (CP/M's equivalent of the Program Segment Prefix in PC DOS; "CALL 0005"). If you examine the layout of the PSP for PC DOS, you can see that this mechanism was carried over, presumably for backward compatibility for programs that may have been translated to x86 code using TRANS86.
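The math behind that clue is short enough to show. A near CALL (opcode E8h) carries a 16-bit displacement that is added to the address of the next instruction, wrapping within the 64K segment. A quick Python check, using the offsets from the listing above (the function name is mine):

```python
def near_call_target(instr_offset, rel16):
    """Target of an 8086 near CALL (E8 rel16) located at instr_offset."""
    next_ip = instr_offset + 3           # E8 opcode plus 2-byte displacement
    return (next_ip + rel16) & 0xFFFF    # IP wraps within the 64K segment


if __name__ == "__main__":
    target = near_call_target(0x058D, 0xFA75)
    print(f"{target:04X}")
```

With the instruction at 058Dh and a displacement of FA75h, the next instruction is at 0590h, and 0590h + FA75h wraps around to 0005h, the CP/M-style entry point.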
Based on this, I believe that the program was likely developed on a CP/M system -- probably not CP/M-86, but regular Z80 CP/M -- and then translated for use on the PC.
The code also makes calls to the PC BIOS for video (INT10h) and serial (INT14h) needs, interleaved with the unchanged calls to DOS using CALL 0005. The CP/M-style calls that were left unchanged were primarily the file access and FCB calls (set DMA, open, close, read, write). The entry code to DOS from the CALL 0005 twiddles the registers and then jumps to the dispatcher, so there was no real need to translate these to equivalent INT21h calls.
Moving further through the code, I did not notice any data that would be either an AT modem string or a phone number (Boca Raton at the time was area code 305). The Hayes Smartmodem 300 wasn't introduced until after the PC was introduced, but the Hayes Micromodem 100 was available on the S-100 platform, being introduced in 1979. The lack of anything that looks "smart" leads me to believe that access was through a leased line from Bellevue to Boca, or maybe an acoustic coupler.
I have run 20HAL both stand-alone and using Scroll Systems Tracer while connecting the test PC to a Hewlett-Packard 4952 protocol analyzer. The format of the command line isn't clear, and it sends a string "TTY FILL 3" and waits for a response. The code is also "protocol-less" (no embedded XMODEM or other file transfer protocol that I could find). Based on this, I'm guessing that file transfers were probably limited to text-only, possibly an Intel HEX file. In CP/M, the output of ASM was an Intel HEX file and a PRN listing file. Thus, a HEX file would make sense - it can be turned into a binary by using the HEX2BIN utility also included on the diskette.
There are several questions I have that would aid in further analysis:
What was the development system for it? Seattle Computers didn't have a Z80 board, but the monitor code for the Gazelle refers to Northstar-, Tarbell- and Cromemco-branded floppy disk controllers (all of which were sold by makers of Z80 CPU cards). The PC DOS pre-release disk had a copy of Seattle's ASM.COM and there was a note about DOS being built with it. I agree with this -- I have a copy of the MS-DOS 2.0 OEM Adaptation Kit for the Gazelle and modified IO.SYS code is built with ASM.COM. I know from Tim's blog that the system was a special system with above-normal RAM -- something possible because there was no video board and no BIOS; only a system monitor at 0xFF800. But, the CP/M footprints in the 20HAL code hint at possibly another system. I know that Microsoft kept an SCP system with 1MB of RAM (the maximum) around for many years because it was still used to "link the linker" as Tim put it (I'm inferring it was used to build versions of Microsoft's development tools).
What connection mechanism was used between the sites? The lack of any intelligent modem commands points to a leased line or acoustic coupler. For a PSTN connection, was it direct to Boca or through a packet network service like Tymnet?
How was it used, and by whom?
What was the data format? The lack of any sort of obvious file transfer protocol points at ASCII-only.
9/27/2020
There is a string that refers to three possible file extensions: MAC (MacroAssembler source file); COM (executable command file); and REL (CP/M relocatable object module). Not sure if these refer to acceptable input file formats. MAC is text-only; the others are binary.
There are two strings which have "CPM" in them: "R CPMTXT" and "R CPMLOD". "R" is the "run" command in TOPS-10, executing the two file transfer programs.
There is a string comparison to "DISK",0Dh. Wondering if there was a full-disk transfer capability...
Regarding file formats, if COM and REL files are acceptable files to transmit, then I would expect to see a file transfer protocol of some sort; not sure you can just send straight binary across the line. But, it could be possible since it would likely be a point-to-point communications link.
The first string sent to the serial channel is ETX,ETX,"PJ",0dh; the second is "TTY FILL 3". The latter command appears to be a TOPS-10 command (SET TTY FILL 3) that sets the number of fill characters to be sent to the terminal after the receipt of certain control codes (look at page 2-275 of the TOPS-10 Operating System Commands Manual, DEC AA-0916F-TB). I can't see there being much need for this command in a stand-alone PC program...unless...the development machine was connected to a terminal connected to a DEC minicomputer running TOPS-10, i.e., an RS-232 pass-through on a VT-x terminal. This would point to no separate dial-up connection, but instead some other connection.
It then runs through a bunch of subroutines and then sends the command " COP DISK=". It also seems to have a block counter.
10/3/2020
As an experiment, I took two "vintage" laptops (those with floppy drives and real RS-232 serial ports) and connected them together with a NULL-modem cable to simulate a cross-country connection.
Both machines were booted with identical MS-DOS 5.0 diskettes, each with the 20HAL program.
Upon running 20HAL on both machines, each one prints out "TTY FILL 3" which, as noted above, is a TOPS-10 command. I really can't imagine IBM having a DECSYSTEM-20 in Boca Raton because DEC would have been "the competition".
In the code, I noted the apparent use of ASCII control codes for handshaking: ^E (ENQ) and ^F (ACK). Hitting ^E on either machine activates something that looks like command mode -- a "*" prompt is presented and will accept commands. Any of the commands I've tried (from looking at the text in the code) gives a "Command Error". Hmmm.
For a second test, the second machine ("Boca Raton") was running a terminal emulator called Procomm. Upon starting, four characters are printed (ETX,ETX,"PJ") and then the TTY FILL 3 overwrites it. The first machine ("Bellevue") prints the banner and then a horizontal bar of blocks, almost hinting at a command length.
If you type on the Boca machine, the Bellevue receives the text. So, it operates like a remote console like the old CP/M MODEM3 (remote console). I tried a few commands, but no response. Hitting ^E locally in Bellevue produces the * prompt but again, no command incantation works.
I did try a simple XMODEM upload from Boca to Bellevue but there is definitely no auto recognition of it on the Bellevue side.
I definitely need to dive into the command strings more. Now that there's communication between the test machines, that's good.
10/4/2020
There seems to be very little difference between running 20HAL under DOS 3.11 and DOS 5.
When looking at the code @ cs:0519 (part of the command parser) it looks like there are three kinds of file commands: A (text/MAC), B (COM), and R (REL). Upon getting that command, it seems to set up a command string like "R CPMTXT" for a MAC file. It then sets the file extension in the FCB to the right type for the command. This routine returns the transfer command in AL, the command string in BX, and the extension in DX. Interestingly, this routine uses an unbalanced stack to continue execution further down. So, the execution address is pushed but there's no pop -- the RET instruction pops it into IP and continues execution there. I couldn't find a reason why it was done this way. It does save a few bytes, but not really enough to be worth it.
In addition to this, there is a "whole disk transfer" command. It doesn't seem to loop through the disk directory, so not sure how it's doing it. Maybe block-by-block?
10/8/2020
Knowing that a PDP-10 is involved, I've built a little PDP-10 setup using SIMH and "TOPS-10 In A Box" to simulate the PDP-10 on the Microsoft campus. I've also had to learn a bit about TOPS-10, which is not an easy task...there are many volumes of manuals spanning, no kidding, probably 3,000 pages or more.
I discovered that "^C^CPJ" breaks into the TOPS-10 monitor and "PJ" is short for "PJOB" which prints the user and terminal name on the screen. There doesn't appear to be any login sequence, reaffirming the thought that the test machine was connected to the PDP-10 through the pass-through serial connector on a terminal or directly to a DZ terminal concentrator. There is a way to simulate this on a PC -- hopefully -- with two COM ports and a program called Termite. Termite is a software-based serial sniffer which can pass data across two ports and do all sorts of filtering and scripts.
I also figured out that the "r" command is the TOPS-10 "run" command. Unlike CP/M or MS-DOS where the "execute" is implied in typing a command without regard to it being an internal or external command, TOPS-10 requires the "r" for external commands. So, CPMTXT and CPMLOD are external TOPS-10 programs which, of course, I don't have. They're not standard.
Here's an interesting segue into computer history, and may change my operating assumption about what 20HAL is.
There was talk on one of the PDP-10 message boards about ITS, the "Incompatible Timesharing System" created by the MIT AI Lab which ran on the PDP-6 (and later, the largely compatible PDP-10). This discussion talked about how some code in the Walnut Creek CP/M Archive was found on ITS tapes, likely because it had been hosted on an ITS system at one point.
Scrounging around the archive, I found a few CompuServe programs that looked like simple CP/M upload and download programs which use the CompuServe A protocol (which, naturally, pre-dated the more common "CompuServe B" protocol). Guess what...CompuServe used PDP-10 machines around the time it was spun off from Golden United Insurance Company in 1975. So, these programs would have been around and modifiable.
What I haven't figured out is if there's any data translation. The PC works in 8-bit bytes but the PDP-10 is a 36-bit word machine, so there is a native difference in word size. Thus, the two programs on the PDP-10 side may also need to do some data translation. If it's straight text, maybe not, but binary, for sure. This might be important if the PDP-10 is being used to ship bytes down to IBM rather than just storing them.
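To illustrate the word-size mismatch, here is a small Python sketch of two common PDP-10 storage conventions: five 7-bit ASCII characters per 36-bit word (left-justified, low bit unused), and four 8-bit bytes per word (left-justified, low four bits unused). Whether CPMTXT and CPMLOD used either layout is unknown; this is only an assumption-laden illustration of the kind of translation involved, and the function names are mine.

```python
def pack_ascii7(text):
    """Pack 7-bit characters five to a 36-bit word, left-justified, low bit unused."""
    data = text.encode("ascii")
    words = []
    for i in range(0, len(data), 5):
        group = data[i:i + 5].ljust(5, b"\0")   # pad the last word with NULs
        word = 0
        for byte in group:
            word = (word << 7) | (byte & 0x7F)
        words.append(word << 1)                 # 35 bits used, bit 35 spare
    return words


def pack_bytes8(data):
    """Pack 8-bit bytes four to a 36-bit word, left-justified, low 4 bits unused."""
    words = []
    for i in range(0, len(data), 4):
        group = bytes(data[i:i + 4]).ljust(4, b"\0")
        word = int.from_bytes(group, "big")
        words.append(word << 4)                 # 32 bits used, 4 bits spare
    return words


if __name__ == "__main__":
    print([f"{w:012o}" for w in pack_ascii7("HELLO")])       # octal, DEC style
    print([f"{w:09X}" for w in pack_bytes8(b"\xB1\x06\x8A\xD0")])
```

Text survives either layout as long as it stays within 7 bits, which fits the idea that plain-text and binary files needed different programs on the PDP-10 side.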
The next part of the experiment is simulating the original development environment using a PC for the development machine, a second PC with the Termite port sniffer (to act as the pass-through) and a third PC hosting SIMH. That should jam up the workbench for a while!
10/9/2020
To test my "pass-through" theory, I set up something that resembled the prototype (in the broadest sense of the word, and sorry for the junky-looking workbench):
The Thinkpad on the left is the 20HAL machine. The ADM31 terminal in the middle is, well, a terminal. The HP laptop on the right is the PDP-10. The three machines are connected by RS-232 connections at 300 baud. The PDP-10 on SIMH uses telnet, so I'm using a serial-telnet bridge program (it's actually a modem emulator for Commodore BBSes, but it works). I know there's a lot of glare on the picture below, but you can see me "dialing" the PDP-10 using the standard Hayes modem dial command, except that it takes an IP address and port. After hitting ^C, I get the dot-prompt and I can log into TOPS-10.
The only problem with this setup is that I can't get the ADM31 to send characters properly so I can't "dial" the modem. So, I had to run Procomm to do it. But, if I exit Procomm to run 20HAL, the connection drops. If I shell to DOS from within Procomm, and then run 20HAL, I don't get the expected characters or response. Hmmm. I tested the ADM31 with HyperTerm so I know it works but maybe it's a quirk with how the pass-through port works. It's possible that the 20HAL machine was connected directly?
10/12/2020
I was able to get SIMH working directly with a USB-serial adapter, so now there's no need for the telnet bridge. The SIMH default is 9600 baud, and if I try to throttle it to 300, I get a message about making sure that the client OS is set for that baud.
At 9600 baud, I was able to get the ADM31 to work with TOPS-10, so that's good. Running at 300 baud didn't seem to work, even though I set the line configuration in SIMH to "300-8N1" and I did "set tty speed 300".
If using pass-through, I've found that you have to keep the 20HAL machine disconnected until TOPS-10 fully comes up. Once TOPS-10 is up, I can connect the 20HAL machine and, using Procomm, I can communicate through to TOPS-10 although it doesn't show up on the ADM31 screen. Again, that's possibly how it's supposed to work, but I'm not 100% sure. Also, running 20HAL seems to stall, so I wonder if the handshaking is messed up when using the pass-through.
Even though SIMH should work at 300 baud, it doesn't, even with a terminal connected directly to it. I decided to patch 20HAL to work at 9600 baud (by changing the RS232 initialization byte in the COM file) so I don't have to worry about line speeds.
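For anyone wanting to reproduce the patch, here is a hedged sketch of how such an initialization byte could be computed. It assumes 20HAL sets up the port through the IBM PC BIOS serial service (INT 14h, AH=00h), where the parameter byte in AL carries the baud rate in the top three bits. That is an assumption; the byte could just as well be written some other way, and the offset of the byte inside the COM file isn't given here, so the sketch only computes values. The function name is hypothetical.

```python
# Field encodings of the IBM PC BIOS INT 14h, AH=00h parameter byte (AL).
# ASSUMPTION: 20HAL initializes the port this way; not confirmed by the source.
BAUD_BITS = {110: 0, 150: 1, 300: 2, 600: 3, 1200: 4, 2400: 5, 4800: 6, 9600: 7}
PARITY_BITS = {"N": 0b00, "O": 0b01, "E": 0b11}


def init_byte(baud: int, data_bits: int, parity: str, stop_bits: int) -> int:
    """Build the parameter byte: bits 7-5 baud, 4-3 parity, 2 stop, 1-0 word length."""
    word = {7: 0b10, 8: 0b11}[data_bits]
    stop = {1: 0, 2: 1}[stop_bits]
    return (BAUD_BITS[baud] << 5) | (PARITY_BITS[parity] << 3) | (stop << 2) | word


print(hex(init_byte(300, 8, "N", 1)))   # 0x43
print(hex(init_byte(9600, 8, "N", 1)))  # 0xe3
```

Under that assumption, moving from 300-8N1 to 9600-8N1 is a one-byte change that only touches the top three bits.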
I did buy a VT100 modern "kit" that has a pass-through serial but uses a PS/2 keyboard and VGA monitor. I'll try setting the terminal type to VT100 once I get this working.
10/13/2020
Made lots of progress, resulting in me changing my thesis on how 20HAL was used. I now believe it was used on the IBM side to connect to Microsoft's PDP-10 to download code. It used the script capability (see below) to send one request for each desired file, and the server-side programs CPMTTY and CPMLOD were responsible for transferring the files. Since two external PDP-10 programs were used, that part will likely be forever lost.
The change in direction from "Microsoft sending" to "IBM downloading" is based on how the code is written, where the development code was stored, and who was running the project. When 20HAL opens a file for transferring, the local filename is opened in "overwrite" mode -- something you only do when receiving a file. The CP/M hallmarks, the jumping around in the code, and the trick jumps (pushing an address on the stack and executing a return without balancing the stack) just don't seem to fit with what I believe would have been IBM's straight-laced development practices. Further, writing it would have required knowledge of both CP/M and TOPS-10, neither of which I believe would have been common at IBM. So, I feel that it was developed by Microsoft.
One minor aside...the PC DOS 0.9 disk includes a program called "TTY.ASC" which is a terminal emulator with file transfer capabilities, written in PC BASIC. This program was not included on the released version of PC DOS 1.0. The date of the file is May 21, 1981, about a month after 20HAL (April 23, 1981). So, I wonder if TTY replaced 20HAL in the late stages of development once PC DOS and/or BASIC was closer to done.
20HAL has a "script" mode: if you pass a filename on the command line, it executes each line in the file. This could be a convenient way to download a number of files sequentially. It doesn't appear that the name of the requested file is sent to the remote server, so I'm not sure how that works yet. If the script file has no extension, "B20" is assumed.
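The script-file handling described above can be sketched in a few lines of Python. This is only an illustration of the behavior, not a reconstruction of the 8086 code: the function names are hypothetical, and skipping blank lines is an assumption.

```python
import os


def script_path(arg: str, default_ext: str = "B20") -> str:
    """Return the script filename, adding the default extension if none was given."""
    _, ext = os.path.splitext(arg)
    return arg if ext else f"{arg}.{default_ext}"


def script_commands(text: str) -> list:
    """Split script text into one command per line.

    ASSUMPTION: blank lines are ignored; the source does not say.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]
```

So a hypothetical "20HAL GETALL" would read GETALL.B20, while "20HAL GETALL.TXT" would use the name as given.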
In the absence of hitting ^E, 20HAL works as a simple terminal emulator. Once you hit ^E, you get a "*" command prompt. ^U cancels the current command and jumps back to the "*" prompt.
The command line is [drive:]DESTFILE.EXT=SRCFILE.EXT-{A/B/R} where A, B or R represents the type of file being requested: A (MAC Macro Assembler file), B (COM file) and R (Relocatable executable file).
Once the command line is received, it sends "COP DISK=SRCFILE.EXT/N" to the PDP-10. The "/N" is appended only with a MAC file. I'm unsure what this switch is for. The only reference to "/N" I can find is relating to the DTCOPY program (the DECtape copy program) which suppresses a directory listing.
"COP" could be short for "copy"; "DISK" could be for the "DSK" system device which is the disk area for the logged-in user (sort of like a home directory).
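Putting the last few paragraphs together, here is a small Python sketch of the request syntax and the "COP" line that results from it. It mirrors the behavior described above rather than the actual code in 20HAL.COM. The names Request, parse_request and cop_command are hypothetical, accepting a lowercase type letter is an assumption, and the line terminator sent to the PDP-10 is left out because it isn't stated.

```python
from typing import NamedTuple


class Request(NamedTuple):
    dest: str  # local file, optionally with a drive prefix
    src: str   # file name on the PDP-10
    kind: str  # "A" (MAC source), "B" (COM file) or "R" (relocatable)


def parse_request(line: str) -> Request:
    """Parse [drive:]DESTFILE.EXT=SRCFILE.EXT-{A/B/R}."""
    dest, sep, rest = line.strip().partition("=")
    src, dash, kind = rest.rpartition("-")
    kind = kind.upper()  # ASSUMPTION: case-insensitive type letter
    if not (sep and dash and dest and src) or kind not in ("A", "B", "R"):
        raise ValueError(f"expected DEST=SRC-A|B|R, got {line!r}")
    return Request(dest, src, kind)


def cop_command(req: Request) -> str:
    """Build the copy command sent to the PDP-10; /N is added only for MAC files."""
    suffix = "/N" if req.kind == "A" else ""
    return f"COP DISK={req.src}{suffix}"
```

For example, a made-up request "B:FOO.MAC=FOO.MAC-A" would produce "COP DISK=FOO.MAC/N", while "FOO.COM=FOO.COM-B" would produce "COP DISK=FOO.COM".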
20HAL then waits in a loop for the TOPS-10 monitor prompt to return, signifying the copy command is done. Upon getting the prompt, it sends the correct remote file transfer command and then sits in a tight loop sending "B" and a bell character, presumably waiting for the transfer to begin. Data is transmitted in 128-byte blocks.
This can be cancelled by hitting "A". I feel it's odd using an ASCII character for this because it's not like "A" is uncommon. More work to do on that.
It's not clear yet if there is a file transfer protocol. It would seem odd to not have a resilient protocol for transferring files over open lines, even leased lines. Will need to compare the code to XMODEM and see if it looks similar. XMODEM uses 128-byte blocks, which is also the size of a CP/M disk record. It kind of fits. XMODEM was developed in 1977 by Ward Christensen, so it would have been well-known to developers at the time.
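As a reference point for that comparison, here is a minimal Python sketch of the original checksum-style XMODEM block: SOH, block number, its one's complement, 128 data bytes, and an 8-bit additive checksum. This is XMODEM as commonly documented, not anything recovered from 20HAL; if 20HAL's 128-byte blocks carry no header or checksum like this, that would point to a simpler, unprotected transfer. The function names are hypothetical.

```python
SOH = 0x01         # start-of-header byte that opens every XMODEM block
BLOCK_SIZE = 128   # same size as a CP/M disk record
PAD = 0x1A         # Ctrl-Z, the CP/M end-of-file marker, used to pad a short last block


def xmodem_frame(block_no: int, data: bytes) -> bytes:
    """Build one 132-byte checksum-mode XMODEM block."""
    if len(data) > BLOCK_SIZE:
        raise ValueError("data exceeds one 128-byte block")
    payload = data.ljust(BLOCK_SIZE, bytes([PAD]))
    n = block_no & 0xFF                # block numbers wrap at 256
    checksum = sum(payload) & 0xFF     # 8-bit sum of the data bytes only
    return bytes([SOH, n, 0xFF - n]) + payload + bytes([checksum])


def frame_is_valid(frame: bytes) -> bool:
    """Receiver-side check of header, block number complement and checksum."""
    return (len(frame) == BLOCK_SIZE + 4
            and frame[0] == SOH
            and frame[1] + frame[2] == 0xFF
            and sum(frame[3:-1]) & 0xFF == frame[-1])
```

A disassembly that shows a 01h byte being matched, a block counter being complemented, or a running 8-bit sum over 128 bytes would be a strong hint of XMODEM ancestry.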
Copyright (c) 1998-2024 Richard A. Cini, Jr. (rcini at msn dot com) All Rights Reserved. All copyrights of any third parties referred to herein are hereby acknowledged. There is no warranty, either express or implied, relating to any of the content contained herein. The site maintainer shall in no event be liable to anyone for damages, including any loss of profits, lost savings, or other incidental or consequential damages arising out of the use or misuse of the information contained on this Web site. Batteries not included. You may use the information contained herein for NON-COMMERCIAL purposes only and AT YOUR OWN RISK.