|
Regression testing is used to probe for induced bugs at every step along the way to proving our cross-tools. Prior to creating our cross-compiler, we generate our early test files from a known good and tested implementation (in the case of 386BSD, a Sequent 386 UNIX system). The compiler output for some unmodified portions of the compiler and the kernel of the operating system is kept as reference assembly language files, for comparison against the output of subsequent compiler versions compiling the same files. An induced error would cause a difference to show up in the comparison of the two. For example, a whole group of instructions might be missing, signifying a dropped expression left uncompiled by a buggy compiler. In a similar fashion, a group of object files from the assembler is also created to compare with those created by the assembler on our 386.

In addition to this set of test files, a record is kept of every kind of induced bug and the source code which generated it. Thus, common bugs which are inadvertently reintroduced periodically can be caught without needing to be debugged a second time (or a third ...).

This mechanism for tracking compiler bugs is not a panacea--it is vulnerable to error in two major ways: it does nothing to aid detection of "latent" bugs in the "good" version we started with, and it becomes useless if modifications to the compiler result in widespread changes in the output code, thus obscuring "bug" changes. However, it proved adequate for the short period (one to two months) it took to reliably compile code in native 386BSD.

"Divide and conquer" is used to isolate the effect of multiple bugs appearing as a single impossible-to-find bug. It is a very powerful tool for use in certain unpleasant predicaments.
For example, during the 386BSD project, we detected the presence of a kernel bug, a compiler bug, and a library bug all hitting at the same point, at a time when we did not yet have an operable debugger to sort out the mess. After isolating the problem with blitheringly primitive printfs, we tried porting similar, related programs, until we found a program that isolated the library bug and the compiler bug at separate times. Once we fixed these bugs, we recompiled the entire set of kernel and application programs. The remaining kernel bug was then obvious to see and correct. Divide and conquer allowed us to solve an "unsolvable" problem.

Consistency checks are implemented in the drivers and trap/system call handlers to detect "impossible" conditions, such as returning to a user program with interrupts off, a completely invalid user stack pointer, and so forth. At one point, we even had them in library code and inline in the C compiler's assembly language output. Throughout the 386BSD development cycle, consistency checks provided a mechanism to detect a problem before it became terminal and untraceable. For example, when we converted 386BSD from 4.3BSD-Tahoe to 4.3BSD-Reno, consistency checks detected a disastrous problem caused by a side effect of the context-switch code.

Consistency checks have their downside, however. Performance degrades with the use of consistency checks in speed-sensitive areas such as system call handling. Resist temptation, however, and don't take them out just for convenience. Otherwise, mysterious problems will reappear and drive you crazy.

Another type of seemingly benign tinkering which results in disaster comes when one tries various performance optimizations too early in the game. We ran into problems every time we tried jumping ahead by improving our early development code before it was fully reliable.
It is better to "comment out" performance improvements, compiler optimizations, and "short circuit" code evaluation until the code and compiler are somewhat shopworn. It is very frustrating when you have found a mechanism for a section of code that might improve performance by an order of magnitude or more, but only at the risk of upsetting the kernel operation itself. Be wary of such improvements--patience is definitely a virtue in a systems project.
|