Microsoft "20HAL"

Last Updated 04/02/2021 20:00 -0400   

 

This topic isn't directly related to my work with the Seattle Gazelle, but it is an important, although minor, part of the overall development process of PC/MS-DOS, especially as we approach the 40th anniversary of its introduction in 1981.

If you visit Dan, TheStarman's site at pcministry, there is a discussion about a June, 1981 pre-release version of PC DOS 1.0 that is sometimes arbitrarily called "IBM PC DOS 0.90". The discussion here is based on a disk image Dan received in 2006 from a former IBM employee which contained a near-final version of PC DOS and a bunch of development-related tools that never made it to the final disk shipped by IBM. The one I'm going to primarily discuss is "20HAL.COM" dated April 23, 1981, about four months before the announcement of the IBM-PC.

I thought the "20" might refer to TOPS-20 running on a PDP-10. Based on an email exchange I had with Tim Paterson in 2014, Microsoft stored the source on a PDP-10 but the code was actually assembled on a specially-configured Seattle Computer Products ("SCP") S-100 machine; there was no discussion of TOPS-20. "HAL" is also an interesting choice, with each letter being one off from IBM -- and/or an interesting parallel to a large, maniacal, all-knowing spacecraft computer of the same name.

20HAL is a relatively short program (1.8k) that was used to send files from Microsoft in Bellevue, WA to IBM's Entry Systems division in Boca Raton, FL. To my knowledge, no one has examined it closely because, really, how important is it? It was a simple system utility to get DOS development files to IBM. To me, it poses an interesting digital archeological project.

I have so far not found anyone who knows how the utility was actually used, and there is no embedded help. So, necessarily, this requires a lot of conjecture based on what would seem logical in that environment in 1980. I'll summarize the results here, but you can scroll down for more of the day-to-day analysis. I had several questions:

To start, I ran the code through Sourcer to get a listing file, and I made extensive use of the Scroll Symbolic Tracer by Murray Sargent (itself a program with an interesting Microsoft-connected history). Looking at the embedded text strings revealed TOPS-10/20 commands and keyboard sequences to print login information, set control code filtering, execute a copy command, and run external programs. The listing is here.

The code had an odd mix of CP/M-style calls (using "CALL 0005") for keyboard input and certain disk functions, calls to the PC BIOS for video and serial communications, and finally two isolated PC DOS INT21h calls for two disk functions. The hallmarks of CP/M, the jumping around in the code, and the stack jumps (pushing an address on the stack and executing a return without balancing the stack) just don't seem to fit with what I have read about IBM's strait-laced development practices. Further, writing it would have required knowledge of both CP/M and TOPS-10/20, neither of which I believe would have been common at IBM. So, I feel that it was developed by Microsoft and given to IBM to use.

The CP/M-style calls and the names of the two transfer programs on the PDP-10 (both start with "CPM") hint at it being originally written on a CP/M 2.2 system and then translated to 8086 using TRANS86. I believe that 86-DOS was set up to accept CP/M-compatible system calls as a way to enable users to migrate Z80-based CP/M programs to the 8086 platform; MS-DOS of course has the same capability. CP/M-86 wasn't released until three months after the PC, so I doubt it was developed using that. SCP did not have a Z80 CPU card, but Gazelle-specific versions of both 86-DOS and MS-DOS could use floppy controller cards that were common in Z80 S-100 systems of the era. So, in a way, it kind of fits.

The next question was how the two systems were connected. In the serial communications code, there are no phone numbers or "smart modem" command strings (the Hayes Smartmodem 300 didn't come out until later in 1981; there were S-100 modems, and although none were "smart", they could take a "dial" string). Based on this, I feel that the link would have been direct in some form. The least expensive would be an acoustic coupler on either end, rather than a leased line...but...the core PC DOS source files were about 300K which, at 300 baud, would take over 2 hours to transfer. Using a 56kbps leased line, that would take only about 45 seconds.
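As a rough check on those two figures, here is a small sketch of the arithmetic. The assumptions are mine: "300K" is taken as 300 x 1024 bytes, the 300-baud link is assumed to be asynchronous with 10 bits per character (start bit, 8 data bits, stop bit), and the leased line is assumed to carry a plain 8 bits per byte with no protocol overhead.

```python
def transfer_seconds(size_bytes, bits_per_second, bits_per_byte):
    """Time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * bits_per_byte / bits_per_second

# Assumption: "about 300K" means 300 * 1024 bytes.
SIZE = 300 * 1024

# Assumption: async framing at 300 baud costs 10 bits per character
# (1 start + 8 data + 1 stop).
async_300 = transfer_seconds(SIZE, 300, 10)

# Assumption: a 56 kbps leased line moves 8 bits per byte, no overhead.
leased_56k = transfer_seconds(SIZE, 56_000, 8)

print(f"300 baud: {async_300 / 3600:.1f} hours")
print(f"56 kbps:  {leased_56k:.0f} seconds")
```

Under these assumptions the acoustic coupler case comes out to a little under 3 hours and the leased line to a little under 45 seconds, which is in line with the estimates above.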

As you get into the code, there are some small clues as to the direction of data and thus, which company likely used it. The utility was provided by a former IBM employee, the code contains the word "downlink", it opens files locally with an "overwrite" attribute, and it acts as a terminal emulator to send remote system commands. To me, these all point to the utility being used by the "downloader", that is, the IBM development team, in order to grab files directly from Microsoft's PDP-10 system.

20HAL operates as follows:

What I have not yet been able to figure out is the file transfer protocol. I have been able to successfully transfer a single block of data, but it relies on either pauses or some handshaking -- something that can't be accomplished purely through HyperTerm. I may try to write a short DOS program to act as the server, probably in BASIC because it's easier to modify.

Below is more of a day-to-day log of what I discovered, but I did not go back and correct earlier entries as I learned more. I would really like to find someone who has used this program, but that's obviously a very limited number of people, most of whom are probably now retired. I would also like to get the PDP-10 file transfer code to see how that worked.

4/2/2021

I decided to continue experimenting with this using acoustic coupler modems and several other pieces to simulate a dial-up setup that might have been used. As of now I only have a "remote console" working under SIMH, but it's a start. The basic setup is as follows:

 

/-------/                                                 /-------/
|       |                                                 |       |        {  SIMH  }
|       | <--> [:::]  //~~~~~~~~~~~~~~~~~//  [:::] <----> |       | <----> { TOPS10 }
/-------/                                                 /-------/        {        }
.......                                                   .......
.......                                                   .......

SERIAL        MODEM 1    TELCO Panasonic    Modem 2        LSI ADM 31      HP LAPTOP
TERMINAL                 PBX and two DM500                 TERMINAL        SIMH
                         telephones

Modem 1 is an Anderson Jacobson A242-A 300-baud originate-only acoustic coupler modem. Modem 2 is a Nixdorf-branded Novation Cat 300-baud originate/answer acoustic coupler modem. The LSI terminal is a Lear-Siegler ADM31 terminal with a terminal pass-through connector. This is important because Modem 2 is connected to the terminal's "Extension" connector and the terminal's "Modem" connector is connected to the computer running SIMH.

I have not been able to get SIMH to properly run the DZ terminal multiplexer with real serial ports at 300 baud using "ATTACH DZ -am LINE=0,CONNECT=SER2;300-8N1" (to put a real serial port on the DZ). So, I resorted to configuring the OPR terminal to use a real port using the SET CONSOLE command in SIMH:

simh>SET CONSOLE SERIAL=SER2;300-8N1

With this, I can type commands on the serial terminal and interact with the simulated PDP-10 over the modem connection. So, that's a good start.

<END>

 

The Gory Details

I would note that for system testing, I used a simulated PDP-10 running TOPS-10 rather than TOPS-20, primarily because I was too lazy to build a TOPS-20 system from scratch, instead using a pre-built TOPS-10 image. I plan on doing a TOPS-20 install at some point, but I don't think it will affect the outcome.

There are certain code sequences which look very much like CP/M 2.2 BDOS system calls. An example:

9663:0584     print_char:
9663:0584 53         push bx
9663:0585 52         push dx
9663:0586 51         push cx
9663:0587 50         push ax
9663:0588 9C         pushf
9663:0589 B1 06      mov cl,6
9663:058B 8A D0      mov dl,al
9663:058D E8 FA75    call cs:5     ;CALL 0005 $-588h
9663:0590 9D         popf
9663:0591 58         pop ax
9663:0592 59         pop cx
9663:0593 5A         pop dx
9663:0594 5B         pop bx
9663:0595 C3         retn

In the disassembly, the CALL at offset 058Dh came out with a negative displacement relative to the instruction pointer. Doing the math, the target works out to offset "5". That was the clue. CP/M system calls used the C register as the system call number and then called BDOS (CP/M's equivalent of the "DOS" part of PC DOS) through a call to a fixed address ("CALL 0005") in what Digital Research referred to as "Low Storage" (CP/M's equivalent of the Program Segment Prefix in PC DOS). If you examine the layout of the PSP for PC DOS, you can see that this mechanism was carried over, presumably for backward compatibility for programs that may have been translated to x86 code using TRANS86.
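The math can be sketched in a few lines. An 8086 near CALL (opcode E8) is three bytes long and carries a 16-bit displacement that is added to the address of the next instruction, wrapping within the 64K segment. The function names below are mine, for illustration only; the offset 058Dh and the displacement word FA75h come from the listing above.

```python
def signed16(value):
    """Interpret a 16-bit word as a two's-complement signed number."""
    return value - 0x10000 if value & 0x8000 else value

def call_rel16_target(instr_offset, disp16, instr_len=3):
    """Target of an 8086 near CALL (E8 + 16-bit displacement).

    The displacement is relative to the instruction that follows the
    CALL, and the sum wraps around within the 64K segment.
    """
    next_ip = (instr_offset + instr_len) & 0xFFFF
    return (next_ip + disp16) & 0xFFFF

disp = 0xFA75                      # displacement word from the listing
target = call_rel16_target(0x058D, disp)

print(hex(signed16(disp)))         # -0x58b
print(f"{target:04X}h")            # 0005h
```

So the displacement is -058Bh measured from the next instruction at 0590h, which is the same thing as the "$-588h" in the listing comment measured from the CALL itself at 058Dh. Either way the target is offset 0005h in the code segment.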

Based on this, I believe that the program was likely developed on a CP/M system -- probably not CP/M-86, but regular Z80 CP/M -- and then translated for use on the PC.

The code also makes calls to the PC BIOS for video (INT10h) and serial (INT14h) needs, interleaved with the unchanged calls to DOS using CALL 0005. The CP/M-style calls that were left unchanged were primarily the file access and FCB calls (set DMA, open, close, read, write). The entry code to DOS from the CALL 0005 twiddles the registers and then jumps to the dispatcher, so there was no real need to translate these to equivalent INT21h calls.
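To illustrate why no translation was needed, here is a small lookup table of the functions involved. The function numbers are taken from the published CP/M 2.2 BDOS and PC DOS 1.x function lists, where these particular numbers happen to match; only the "mov cl,6" case is something I have shown from the 20HAL listing above, and the table and helper names are mine.

```python
# Function numbers that mean the same thing whether passed in CL to
# "CALL 0005" (CP/M style) or in AH to INT 21h (DOS style).
# Assumption: taken from the CP/M 2.2 and PC DOS 1.x function lists,
# not extracted from the 20HAL listing (except 06h, seen in print_char).
SHARED_FUNCTIONS = {
    0x06: "direct console I/O",
    0x0F: "open file (FCB)",
    0x10: "close file (FCB)",
    0x14: "sequential read",
    0x15: "sequential write",
    0x1A: "set DMA / disk transfer address",
}

def describe(function):
    """Show both calling styles for one shared function number."""
    name = SHARED_FUNCTIONS[function]
    return (f"{name}: 'MOV CL,{function:02X}h / CALL 0005' "
            f"or 'MOV AH,{function:02X}h / INT 21h'")

for number in sorted(SHARED_FUNCTIONS):
    print(describe(number))
```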

Moving further through the code, I did not notice any data that would be either an AT modem string or a phone number (Boca Raton at the time was in area code 305; area code 407 was not split off until 1988). The Hayes Smartmodem 300 wasn't introduced until after the PC was introduced, but the Hayes Micromodem 100 was available on the S-100 platform, being introduced in 1979. The lack of anything that looks "smart" leads me to believe that access was through a leased line from Bellevue to Boca, or maybe an acoustic coupler.

I have run 20HAL both stand-alone and using Scroll Systems Tracer while connecting the test PC to a Hewlett-Packard 4952 protocol analyzer. The format of the command line isn't clear; when run, the program sends the string "TTY FILL 3" and waits for a response. The code is also "protocol-less" (no embedded XMODEM or other file transfer protocol that I could find). Based on this, I'm guessing that file transfers were probably limited to text-only, possibly an Intel HEX file. In CP/M, the output of ASM was an Intel HEX file and a PRN listing file. Thus, a HEX file would make sense - it can be turned into a binary by using the HEX2BIN utility also included on the diskette.
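For reference, here is a minimal sketch of what a HEX-to-binary conversion involves, using the standard Intel HEX record layout (colon, byte count, 16-bit address, record type, data, checksum). This is not the HEX2BIN code from the diskette, which I have not disassembled; the function names, the 0100h load origin, and the sample record (built from the "mov cl,6 / mov dl,al" bytes in the listing above) are assumptions for illustration.

```python
def parse_hex_record(line):
    """Decode one Intel HEX record into (address, record_type, data)."""
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5 or len(raw) != raw[0] + 5:
        raise ValueError("record length does not match byte count")
    if sum(raw) & 0xFF != 0:
        raise ValueError("checksum mismatch")
    address = (raw[1] << 8) | raw[2]
    return address, raw[3], raw[4:-1]

def hex_to_binary(lines, origin=0x100):
    """Build a flat binary image from data records.

    Assumption: the image loads at `origin` (0100h, as for a .COM file).
    Stops at an end-of-file record (type 01) or a zero-length data
    record, which some older tools used as the end marker.
    """
    image = bytearray()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        address, record_type, data = parse_hex_record(line)
        if record_type == 0x01 or len(data) == 0:
            break
        if record_type != 0x00:
            continue  # ignore record types this sketch doesn't handle
        start = address - origin
        if start < 0:
            raise ValueError("record below load origin")
        end = start + len(data)
        if len(image) < end:
            image.extend(b"\x00" * (end - len(image)))
        image[start:end] = data
    return bytes(image)

# Hypothetical sample: four bytes at 0100h, then an end-of-file record.
sample = [":04010000B1068AD0EA", ":00000001FF"]
print(hex_to_binary(sample).hex())  # b1068ad0
```

Because every record is printable text with its own checksum, a file like this can survive a plain terminal session with no transfer protocol at all, which is why it fits the "protocol-less" design.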

There are several questions I have that would aid in further analysis:

 

9/27/2020

 

10/3/2020

 

10/4/2020

 

10/8/2020

 

10/9/2020

The Thinkpad on the left is the 20HAL machine. The ADM31 terminal in the middle is, well, a terminal. The HP laptop on the right is the PDP-10. The three machines are connected by RS-232 connections at 300 baud. The PDP-10 on SIMH uses telnet, so I'm using a serial-telnet bridge program (it's actually a modem emulator for Commodore BBSes, but it works). I know there's a lot of glare in the picture below, but you can see me "dialing" the PDP-10 using the standard Hayes modem command, except that it takes an IP address and port. After hitting ^C, I get the dot-prompt and I can log into TOPS-10.

 

The only problem with this setup is that I can't get the ADM31 to send characters properly so I can't "dial" the modem. So, I had to run Procomm to do it. But, if I exit Procomm to run 20HAL, the connection drops. If I shell to DOS from within Procomm, and then run 20HAL, I don't get the expected characters or response. Hmmm. I tested the ADM31 with HyperTerm so I know it works but maybe it's a quirk with how the pass-through port works. It's possible that the 20HAL machine was connected directly?

 

10/12/2020

 

10/13/2020

 

That's it for now...more to come as I continue to dig.

 

Copyright (c) 1998-2021 Richard A. Cini, Jr. (rcini at msn dot com) All Rights Reserved. All copyrights of any third parties referred to herein are hereby acknowledged. There is no warranty, either express or implied, relating to any of the content contained herein. The site maintainer shall in no event be liable to anyone for damages, including any loss of profits, lost savings, or other incidental or consequential damages arising out of the use or misuse of the information contained on this Web site. Batteries not included. You may use the information contained herein for NON-COMMERCIAL purposes only and AT YOUR OWN RISK.