Jolix
/joh'liks/ n.,adj. 386BSD

PORTING UNIX TO THE 386: A PRACTICAL APPROACH


William & Lynne Jolitz


Step by step, we proved out the running environment: the program tools, program loading and execution, trap handling, the stack, and support for a high-level language. Our project moved from fragile to substantive.




The First Step
At this point, there is little confidence in any of our tools because we have yet to actually "shake down" the absolute loader, assembler, and link editor. Beginning with trivial programs of a few instructions and gradually expanding them, we incrementally prove our tools to the point where we can use them with some degree of confidence. The journey begins, as always, with a single step.

Listing 1: Simplest protected mode program providing output
# File "hi.s":
   .text
start:
   # put "hi" mid screen on display
   movl $0x0e690e48, 0x0b8800

   hlt
The program in Listing One is the simplest protected-mode program we can write that generates output on the screen. It displays the message "hi" midscreen and then stops. A program this simple must always work; if it does not, there are only a handful of places to look for the cause. During our port, this program originally did not work, due to bugs in the early loader and assembler.
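
To see why a single 32-bit store puts two characters on the screen, the following C sketch decodes the constant from Listing One and locates the target address on an 80 x 25 text screen. It is only an illustration that runs on any hosted C compiler, not part of the port itself; the names decode_store and cell_position are hypothetical, and the 0xb8000 base address is taken from Listing Two.

```c
#include <assert.h>
#include <stdint.h>

#define CGA_BASE 0xb8000u   /* start of text-mode video RAM (see Listing 2) */
#define COLS     80u        /* 80 x 25 text mode, two bytes per cell */
#define ROWS     25u

/* One screen cell: character code in the low byte, attribute in the high byte. */
struct cell { uint8_t ch; uint8_t attr; };

/* Split a 32-bit store into the two cells it fills. The 386 is
 * little-endian, so the low 16 bits land at the lower address,
 * which is the left-hand cell. */
void decode_store(uint32_t value, struct cell out[2])
{
    out[0].ch   = value & 0xff;
    out[0].attr = (value >> 8) & 0xff;
    out[1].ch   = (value >> 16) & 0xff;
    out[1].attr = (value >> 24) & 0xff;
}

/* Convert a video RAM address into a zero-based row and column. */
void cell_position(uint32_t addr, unsigned *row, unsigned *col)
{
    assert(addr >= CGA_BASE && addr < CGA_BASE + COLS * ROWS * 2);
    uint32_t index = (addr - CGA_BASE) / 2;   /* two bytes per cell */
    *row = index / COLS;
    *col = index % COLS;
}
```

Decoding 0x0e690e48 this way gives character code 0x48 ('H') followed by 0x69 ('i'), both with attribute 0x0e, and address 0xb8800 works out to row 12, column 64, counting from zero.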

While this may seem trite to some, this program illustrates the pathetic level at which untested software tools begin. After eliminating a handful of nuisance bugs, this simple program did work, and it proved valuable because it was able to smoke out bugs quickly.

DOS "MASM" vs UNIX "as": Assembler Syntax Difference
A side note for those who noticed that our assembly code format has changed since the previous article, where we used Microsoft's MASM: UNIX 386 assemblers take their operands in the opposite order, partly because early UNIX systems appeared on PDP-11s, which preferred this ordering style. Thus, on MS-DOS with MASM:
  mov  eax,edx ;move 32 bit edx into eax register
corresponds to the UNIX assembler format:
  movl  %edx,%eax #move 32 bit edx into eax register
In other words, the UNIX assembler takes its operands as (source, destination), where MASM takes them as (destination, source). This is yet another stunning "improvement" in the field of computer languages, destined to be appreciated by those simultaneously debugging a MASM-coded bootstrap loader and code generated by the GAS UNIX assembler!
Since we go back and forth between our DOS environment and our standalone BSD UNIX protected-mode one, we have to keep the nomenclature of DOS and UNIX straight. Along the way, we prove out program loading, input/output, the stack, and exception handling as we climb in level of abstraction from machine code to assembler to high-level language programs.

Listing 2: Standalone "Hello World" program
# File "hello.s":
#  Minimal test of GNU GAS assembler,
#  handles CGA & strings.
   .text
start:
   movl   $0xA0000,%esp

   pushl   $str
   call   _puts
   pop   %eax
   hlt
str:
   .asciz   "\n\rHello world from GAS\r\n"

_puts:
   push   %ebx
   movl   8(%esp),%ebx
1: cmpb   $0,(%ebx)  # until we see a null
   je   2f

   movzbl  (%ebx),%eax
   pushl   %eax
   call   _putchar   # put out characters
   popl   %eax

   incl   %ebx
   jmp   1b
2: popl   %ebx
   ret
crtat:
   .long  0xb8000    # address of CGA video RAM
row:
   .long  0

_putchar:
   movzbl 4(%esp),%eax
   push   %ebx
   push   %ecx
   movl   crtat,%ebx

   # continuous output off screen edge & bottom
   cmpl   $0xb8000+80*25*2,%ebx  
   jl     1f
   movl   $0,row
   movl   $0xb8000+80*(25-1)*2,%ebx
1: cmpb   $0xd,%al    # cr
   jne    1f
   movl   $80,%ecx    # clear rest of line
   subl   row,%ecx
   movl   %ebx,%edi
   movw   $0xfff,%ax
   cld
   rep
   stosw
   subl   row,%ebx
   subl   row,%ebx
   movl   $0,row
   jmp    9f
1: cmpb   $0xa,%al    # nl
   jne    2f
   cmpl   $0xb8000+80*(25-1)*2,%ebx   # scroll?
   jl     1f
   movl   $0xb8000,%edi      # scroll page
   movl   $0xb8000+80*2,%esi
   movl   $80*(25-1),%ecx
   cld
   rep
   movsw
   movl   $80,%ecx    # clear new bottom line
   movl   $0xb8000+80*(25-1)*2,%edi
   movw   $0,%ax
   rep
   stosw
   sub    $80*2,%ebx   # position cursor before lf
1: add    $80*2,%ebx
   jmp    9f
2: orw    $0x0e00,%ax  # attribute
   movw   %ax,(%ebx)
   addl   $2,%ebx
   incl   row
9: movl   %ebx,crtat
   pop    %ecx
   pop    %ebx
   ret
As we proceed further, we add more complexity, testing span-dependent jumps, stacks, and other mechanisms. Listing Two is a more elaborate program that provides character and string output functions which write to the screen, thus allowing for a primitive degree of debugging.
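
For readers who find the cursor handling easier to follow in a high-level language, here is a rough C model of what the _putchar routine in Listing Two does. It is a sketch under several assumptions, not a translation: a plain array stands in for video RAM, the column is derived from the cursor position instead of being kept in a separate counter, cleared cells are written as zero, and the names sim_putchar and sim_puts are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COLS 80
#define ROWS 25
#define ATTR 0x0e00              /* attribute OR'ed into each cell in Listing 2 */

uint16_t screen[ROWS * COLS];    /* stands in for video RAM at 0xb8000 */
int cursor;                      /* cell index; plays the role of crtat */

void sim_putchar(unsigned char c)
{
    /* Ran off the bottom: restart at the beginning of the last line. */
    if (cursor >= ROWS * COLS)
        cursor = (ROWS - 1) * COLS;

    if (c == '\r') {
        /* Blank from the cursor to the end of the line, then go to column 0. */
        int col = cursor % COLS;
        memset(&screen[cursor], 0, (COLS - col) * sizeof screen[0]);
        cursor -= col;
    } else if (c == '\n') {
        if (cursor >= (ROWS - 1) * COLS) {
            /* On the last line: scroll up one line and blank the new bottom line. */
            memmove(screen, screen + COLS, (ROWS - 1) * COLS * sizeof screen[0]);
            memset(screen + (ROWS - 1) * COLS, 0, COLS * sizeof screen[0]);
        } else {
            cursor += COLS;      /* move straight down, same column */
        }
    } else {
        screen[cursor++] = ATTR | c;   /* character plus attribute */
    }
    assert(cursor >= 0 && cursor <= ROWS * COLS);
}

/* String output built on the character routine, like _puts. */
void sim_puts(const char *s)
{
    while (*s)
        sim_putchar((unsigned char)*s++);
}
```

Calling sim_puts with the string from Listing Two leaves the text on the second screen line and the cursor at the start of the third. Note that a carriage return clears only from the cursor to the end of the line, so text already printed on that line survives.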

Listing 3: Standalone runtime and C "Hello World"
# [Excerpted from file: srt.s]
 ...
entry:   .globl   entry
   jmp   1f
   .space   0x500   # skip over BIOS data area
1:   cli            # no interrupts yet

   movl   $0xA0000,%esp

   movl   %esp,%edx
   movl   $_edata,%eax
   subl   %eax,%edx # clear stack and heap store
   pushl   %edx
   pushl   %eax
   call   _bzero
   popl   %eax
   popl   %eax

   call   _main
 ...

/* File: hello.c */
main() {   printf("Hello, world!\n");   }
Listing Three contains a simple runtime start-off for C, with the obligatory "hello world" program heralding our arrival into serious programming mode. At this point, we've found most of the silly bugs and also created a primitive debugging tool. One might even claim that, through this method, our entire BSD UNIX system is derived from the two-instruction program we started with!
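
The one piece of real work in the start-off excerpt is clearing memory between the end of initialized data and the top of the stack before main runs. A minimal C sketch of that arithmetic follows, assuming a hosted compiler and an ordinary buffer in place of physical memory; clear_above_data is a hypothetical name, and memset stands in for the _bzero routine the excerpt calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero everything from the end of initialized data up to (but not
 * including) the initial stack top, and report how many bytes that was.
 * In the excerpt, edata corresponds to $_edata in %eax and stack_top
 * to the value loaded into %esp. */
size_t clear_above_data(unsigned char *edata, unsigned char *stack_top)
{
    assert(stack_top >= edata);
    size_t len = (size_t)(stack_top - edata);  /* length = top - edata */
    memset(edata, 0, len);                     /* bzero(address, length) */
    return len;
}
```

The arguments are pushed in reverse order in the assembly (length first, then address) because the C calling convention used there expects the first argument nearest the top of the stack.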

We can't stress enough the need to "prove out" the running environment prior to handling larger efforts. It's much easier to gradually bring up an absolute programming environment this way than to attempt to work through compound errors, especially on a machine that will hand you a processor restart fault and dump you back in the BIOS without a hope in hell of knowing where it came from.





Copyright 1989, 1990, 2006 TeleMuse Partners, William Jolitz and Lynne Jolitz