A number of features of the Basmark QuickBASIC compiler deserve fuller description. Although the QuickBASIC Programmer’s Manual is designed chiefly for reference, some topics are not adequately treated by that format. This document collects the scattered components of several topic areas into coherent discussions for the purpose of instruction. It does not attempt to describe these topics exhaustively, because its instructive qualities would suffer.
The author would be foolhardy to attempt to describe the fundamentals of QuickBASIC here since the topic has been discussed more eloquently by others. For a basic QuickBASIC tutorial, the reader may choose from the many available works. Rather than compete with these works, this document focuses on the features of Basmark QuickBASIC which are not as thoroughly discussed in independent publications.
With this in mind, a tutorial is presented in several topic areas: structured programming, dynamically allocatable arrays and input/output.
For the sake of discussion, suppose we are interested in finding the roots of a quadratic equation, that is, the values of x such that

    ax² + bx + c = 0

The two solutions are x = (-b ± √(b² - 4ac)) / 2a. One might code these solutions as:
    print (-b + sqr(b * b - 4 * a * c)) / (a + a)
    print (-b - sqr(b * b - 4 * a * c)) / (a + a)
With single-line user-defined functions, the computation can be broken into named pieces:

    def fnd(a, b, c) = b * b - 4 * a * c
    def fnx(a, b, d) = (-b + d) / (a + a)
    def fnroot1(a, b, c) = fnx(a, b, sqr(fnd(a, b, c)))
    def fnroot2(a, b, c) = fnx(a, b, -sqr(fnd(a, b, c)))
    print fnroot1(a, b, c), fnroot2(a, b, c)
A multi-line user-defined function can test the discriminant before taking its square root:

    def fnroot1(a, b, c)
        c = fnd(a, b, c)
        if c < 0 then exit def
        fnroot1 = fnx(a, b, sqr(c))
    end def
As with the single-line user-defined function, values are “passed by value,” that is, copies of the values passed from the calling routines are used. Thus, the assignment to “c” in the first line of the example above does not affect the value of “c” in the PRINT statement. Also, the parameters in the function definition are private to the function, so we could have assigned values to a, b and c within the function without fear of clobbering a, b or c outside the function.
We are welcome to use other variables within a user-defined function but these are not private copies, so we must be careful not to clobber values needed outside the function.
The previous example avoids an error with the square root function, but it does not report the problem in a meaningful way. The preferable solution would be to report not only the roots, but also the number of roots. A quadratic equation has zero, one or two roots depending on whether b² - 4ac is negative, zero or positive. It would be most useful to have a function which returned the number of roots as its value and set the appropriate root values as well:
    def fnd(a, b, c) = b * b - 4 * a * c
    def fnx(a, b, d) = (-b + d) / (a + a)
    def fnquad(a, b, c)
        c = fnd(a, b, c)
        fnquad = 0
        if c < 0 then exit def
        c = sqr(c)
        root1 = fnx(a, b, c)
        fnquad = 1
        if c = 0 then exit def
        root2 = fnx(a, b, -c)
        fnquad = 2
    end def
    n = fnquad(a, b, c)
    if n > 0 then print root1
    if n > 1 then print root2
Although the previous example is good, it suffers from the need for special variables which, inevitably, leads to confusion about the side effects of the function. After all, if we were to interject code between the call of the function and the use of root1 and root2, it would become unclear that these variables were set by the function.
We would like to be able to pass variables to the function whose values would be set upon return. It would be relatively clear that the function had an effect on these variables and we would not be constrained to use particular names like root1 and root2. Fortunately, subprograms have this characteristic:
    sub quad(a, b, c, nroot, root1, root2) static
        static d
        d = fnd(a, b, c)
        nroot = 0
        if d < 0 then exit sub
        d = sqr(d)
        root1 = fnx(a, b, d)
        nroot = 1
        if d = 0 then exit sub
        root2 = fnx(a, b, -d)
        nroot = 2
    end sub
    call quad(a, b, c, n, r1, r2)
    if n > 0 then print r1
    if n > 1 then print r2
In addition, we have declared “d” in a STATIC statement so we need not be concerned about the “d” within the subprogram clobbering the value of “d” outside the subprogram. Declaring a variable in a STATIC statement creates a private, local variable regardless of what appears outside the subprogram. Actually, this declaration is usually redundant because without any declaration variables used within a subprogram are local variables, different from variables with the same name outside the subprogram. Unfortunately, as we shall see later, this default can be overridden by code outside the subprogram, which makes us uncomfortable. The reason we write subprograms in the first place is so our code will be safe from the hectic world outside. So to be perfectly safe, the “subprogrammer” must use the STATIC statement. Nevertheless, for a quick job we might omit the STATIC statement and get away with it.
At other times, we might need a value from outside a subprogram. As we now know, variables used within a subprogram are by default local to the subprogram. We can override this default and gain access to a “global” variable with the SHARED statement, another declarative statement. Suppose that, in the subprogram above, we had declared “d” with SHARED instead of STATIC:

    shared d

The “d” within the subprogram would then be the same variable as the “d” outside the subprogram.
The form of a subprogram is always like that given in this example. The header has the word SUB followed by the subprogram’s name followed by the list of arguments (or parameters) in parentheses and finally the word STATIC. If there are no arguments to the subprogram, the parentheses are also omitted. The subprogram ends with END SUB, just as a multi-line user-defined function ends with END DEF. Notice the use of EXIT SUB to terminate execution of the subprogram: it is analogous to the EXIT DEF in the multi-line user-defined function.
The previous example is good, but it would be better if the subprogram could return the number of roots as its value, as our last user-defined function, fnquad, did. Unlike other structured dialects of BASIC, Basmark QuickBASIC allows a value to be returned from a subprogram:
    sub usrquad(a, b, c, root1, root2) static
        static d
        d = fnd(a, b, c)
        usrquad = 0
        if d < 0 then exit sub
        d = sqr(d)
        root1 = fnx(a, b, d)
        usrquad = 1
        if d = 0 then exit sub
        root2 = fnx(a, b, -d)
        usrquad = 2
    end sub
    n = usrquad(a, b, c, r1, r2)
    if n > 0 then print r1
    if n > 1 then print r2
In the example above we passed simple variables to the subprogram. Three of these (a, b and c) represented “input” parameters, variables used only for their values, and two others (r1 and r2) represented “output” parameters, variables used as storage locations. The difference is important, yet it is determined only by convention and may vary from subprogram to subprogram. Any expression may be used as a parameter to a subprogram, but what happens when parameters are misused, and how do we avoid the resulting problems?
Suppose our subprogram had been miscoded and set the value of “c”, altering the value of “c” in the calling code. That would be a difficult bug to track down since we would tend to look for the bug in the calling code rather than the erroneous subprogram. We can protect our input parameters with parentheses:
    n = usrquad((a), (b), (c), r1, r2)
An output parameter from a subprogram ought to be a storage location, a simple variable or a reference to an array element. Suppose our calling code had been miscoded and used an expression like “x+y” or “42.0”. These are not storage locations at all. Fortunately, expressions like these are also copied to a safe place and the copy made available to the subprogram. So while there is little to recommend passing a constant as an output parameter location to a subprogram, there is no harm in it.
Suppose we have a job which requires array multiplication. It would be most convenient to have a single subprogram to do the job and test that the array dimensions are appropriate:
    if usrarrmul(c(), a(), b()) then print "Bad dimensions"
The following example illustrates array multiplication:
    sub usrarrmul(c(2), a(2), b(2)) static
        static i, j, k, sum
        usrarrmul = -1
        if ubound(a, 1) <> ubound(c, 1) then exit sub
        if ubound(a, 2) <> ubound(b, 1) then exit sub
        if ubound(b, 2) <> ubound(c, 2) then exit sub
        usrarrmul = 0
        for i = 1 to ubound(c, 1)
            for j = 1 to ubound(c, 2)
                sum = 0
                for k = 1 to ubound(a, 2)
                    sum = sum + a(i, k) * b(k, j)
                next k
                c(i, j) = sum
            next j
        next i
    end sub
A new function UBOUND is illustrated here; it returns the upper bound of the array given by the first argument along the dimension given by the second argument. For single dimension arrays, the second argument may be omitted. With this function, a subprogram may be written which is not dependent on a particular array size and without the need for special arguments to indicate size.
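The same dimension checks can be sketched in Python, used here only because it runs almost anywhere; it is not part of Basmark QuickBASIC. The function name arrmul and the use of nested lists are assumptions made for illustration, and indices start at 0 rather than 1. As in usrarrmul, the result is -1 for mismatched dimensions and 0 otherwise.

```python
def arrmul(c, a, b):
    """Store the matrix product a*b in c; return 0 on success, -1 on bad dimensions.

    Assumes each argument is a non-empty, rectangular list of lists.
    """
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    rows_c, cols_c = len(c), len(c[0])
    # The same three checks as usrarrmul, made on sizes instead of upper bounds.
    if rows_a != rows_c or cols_a != rows_b or cols_b != cols_c:
        return -1
    for i in range(rows_c):
        for j in range(cols_c):
            c[i][j] = sum(a[i][k] * b[k][j] for k in range(cols_a))
    return 0


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[0, 0], [0, 0]]
status = arrmul(c, a, b)
print(status, c)
```

Because the sizes are read from the arrays themselves, the caller passes no separate size arguments, which is the point made about UBOUND above.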
For those familiar with other programming languages, particularly C, it is often convenient to write specialized code in other languages. With Basmark QuickBASIC, it is possible to interface QuickBASIC with other languages. Some languages, such as FORTRAN, may require an intermediary routine written in C or assembler, but these latter two languages can always be interfaced directly with QuickBASIC. In fact, Basmark QuickBASIC subprograms are defined in terms of the C programming language, so truly portable bilingual programs can be written in these languages.
For example, the usrquad subprogram described earlier could just as easily be written in C:
    #define FND(a, b, c) ((b) * (b) - 4 * (a) * (c))
    #define FNX(a, b, d) ((-(b) + (d)) / ((a) + (a)))

    float FLTquad(a, b, c, root1, root2)
    float *a, *b, *c, *root1, *root2;
    {
        double d, sqrt();

        d = FND(*a, *b, *c);
        if (d < 0.0)
            return (0.0);
        d = sqrt(d);
        *root1 = FNX(*a, *b, d);
        if (d == 0.0)
            return (1.0);
        *root2 = FNX(*a, *b, -d);
        return (2.0);
    }
For large programming projects, the size of the program can become a bit unruly: the program takes longer to load in editors, the compilation time becomes unreasonably long (even if much of the code has not been altered since the last compilation) and the program generally suffers from a lack of organization. To cope with these difficulties, we can organize the code into a series of coherent files (or modules). The modules are compiled separately, so only those modules which have been modified need re-compilation. Finally, a loader is used to produce an executable program from all the compiled modules. This scheme is well supported by the UNIX system and the Basmark QuickBASIC compiler, so the programmer needs only a minimal awareness of the process; indeed, the process is completely invisible in the single-module scenario.
In any case, one module is designated the “main” module and we shall call the others “secondary” modules. The main module is the module where execution begins. The secondary modules, if any, can only be executed as a result of subprogram calls from other modules.
Suppose we have a main module and two secondary modules:
    [Diagram: the three modules compute, input and output]
The main module, named “compute”, calls subprograms in its own module and other modules. Subprogram calls are the only method by which code in other modules can be executed and here the method is used quite effectively. In fact, some programming theorists would be quite pleased with this structure.
The QuickBASIC compiler must be informed which module is the main module. There are three methods of doing this. The simplest is to list the main module first on the command line:
    basic compute.b input.b output.b
Of course, specifying all the modules on the command line does not allow for separate compilation, one of the principal advantages of multiple modules. We would prefer to compile the modules separately and then load them together:
    basic -c compute.b
    basic -c -M input.b
    basic -c -M output.b
    basic compute.o input.o output.o
The final method for informing the QuickBASIC compiler of the main module is illustrated in the diagram above. It is often preferable, and more straightforward, to use the same command line for all modules (rather than fool with special switches) and leave the identification of the main module to the code of the main module itself. A special meta-command, $MAIN, may be placed at the beginning of the main module, as shown in the diagram above. This meta-command overrides the meaning of the “-M” switch so that the “-M” switch can appear on the command line for all modules.
This method, placing the $MAIN meta-command at the beginning of the main module, is especially convenient in conjunction with “make”, the program maintenance program. This UNIX utility uses file modification dates and a file, named “Makefile”, of user supplied dependency information to direct compilation of a multi-module program. “Make” even has implicit rules for compiling modules for standard UNIX languages like C and FORTRAN. This utility is just what the doctor ordered to assist programmers in avoiding redundant compilation of unmodified modules and yet ensuring the production of an up to date executable program. So what more could you ask for? Well, an implicit rule for compiling QuickBASIC modules, of course:
    .SUFFIXES: .b
    .b.o: ; basic -c -M $<

    OBJS=compute.o input.o output.o
    a.out: $(OBJS)
    	basic $(OBJS)
    input.o output.o: rep.i
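The decision “make” takes for each target can be sketched in a few lines of Python. This is only an illustration of the modification-date comparison described above, not a description of how any particular “make” is implemented; the function name needs_rebuild and the throwaway files are assumptions.

```python
import os
import tempfile


def needs_rebuild(target, dependencies):
    """True if target is missing or older than any of its dependencies."""
    if not os.path.exists(target):
        return True
    built = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > built for dep in dependencies)


# Throwaway files stand in for input.b, rep.i and input.o.
with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "input.b")
    inc = os.path.join(work, "rep.i")
    obj = os.path.join(work, "input.o")
    for path in (src, inc):
        open(path, "w").close()
    before = needs_rebuild(obj, [src, inc])   # object file does not exist yet
    open(obj, "w").close()
    os.utime(src, (1000, 1000))               # sources older than the object
    os.utime(inc, (1000, 1000))
    os.utime(obj, (2000, 2000))
    fresh = needs_rebuild(obj, [src, inc])
    os.utime(inc, (3000, 3000))               # rep.i edited after the build
    stale = needs_rebuild(obj, [src, inc])

print(before, fresh, stale)
```

The last case corresponds to the dependency line for rep.i: editing the include file makes the object files that depend on it out of date.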
In multi-module programs, each module has its own private error recovery routine and data area (DATA statements).
A unique property of the error recovery routine, even in secondary modules, is that this routine is not contained in a subprogram. Instead the error recovery routine lies outside of all subprograms. The “ON ERROR GOTO x” statement, of course, may be inside a subprogram. This allows each module to provide a routine to recover from errors that may occur in the module.
In the diagram in the preceding section, the input module contains a routine at line 100 for reporting errors in the input. However, if an error occurs in the compute or output module, such an error is not trapped and execution does not proceed with the input module’s error recovery routine. The other modules could, of course, have their own private error recovery routines.
In addition to the passing of parameters to subprograms, data may be communicated between modules with labeled and unlabeled COMMON blocks. Because the order of the items in a COMMON block is crucial, it is standard practice to place COMMON statements in an include file to ensure consistency. In the diagram in the preceding section, suppose “rep.i” contains the declaration of a COMMON block for reporting errors:
    COMMON SHARED /report/ errnum, errname$
A unique COMMON block is the unlabeled COMMON block, which has no name. Declarations for this COMMON block omit the slashes as well. The items in the unlabeled COMMON block, alone, are communicated across chains (with the CHAIN statement).
COMMON statements may optionally declare the items “SHARED”, as in this example. These items will be available inside all subprograms in the module. (By default, variables are private to subprograms.) This form of the COMMON statement is often used solely to exploit the SHARED declaration method.
A special note is in order regarding segmented architectures, especially that found on the Intel family of microprocessors.
Very large programs coded as a single module often will not compile because the code is too large. The problem can be solved. The Basmark QuickBASIC compiler implements each module as a single code segment although any number of code segments may be loaded together into an executable program. The solution is simply to rearrange the code into a series of smaller modules. The programmer need not be aware of the segmentation system to implement this solution.
The orientation toward GOTO statements in the QuickBASIC language has always distressed programming theorists. Their objections, for the most part, are reasonable, it being very difficult to grasp the meaning of code that seems to branch about arbitrarily. But while the theorists have regarded the programmers as crude and the programmers have regarded the theorists as high-falutin, the problem was probably never the GOTO statement at all but the limitations of the conditional statement. We do not have any reason to believe that college professors will start recommending QuickBASIC nor that programmers will stop using GOTO statements, but the limitations of the conditional statement have been lifted.
To illustrate, the usrquad subprogram example we were fond of may be rewritten in a more readable way:
    sub usrquad(a, b, c, root1, root2) static
        static d
        d = fnd(a, b, c)
        if d < 0 then
            usrquad = 0
        elseif d = 0 then
            root1 = fnx(a, b, 0.0)
            usrquad = 1
        else
            d = sqr(d)
            root1 = fnx(a, b, d)
            root2 = fnx(a, b, -d)
            usrquad = 2
        end if
    end sub
All block IF statements have an IF and an END IF statement but the ELSEIF and ELSE statements are optional. Any number of ELSEIF statements may be used. This makes it possible to create structures similar to the “case” statement found in many other languages.
In the example above the multi-line conditional makes the three-way nature of the code apparent. The consequences of each branch are easy to see.
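For readers who know other languages, the same three-way branch can be sketched in Python. This is an illustration only; the function name quad and the choice to return the roots in a tuple, rather than through output parameters, are assumptions of the sketch, and the coefficient a is assumed to be nonzero.

```python
import math


def quad(a, b, c):
    """Return (count, roots) for a*x*x + b*x + c = 0, assuming a is nonzero."""
    d = b * b - 4 * a * c              # the discriminant, fnd in the text
    if d < 0:
        return 0, ()
    elif d == 0:
        return 1, (-b / (a + a),)
    else:
        s = math.sqrt(d)
        return 2, ((-b + s) / (a + a), (-b - s) / (a + a))


print(quad(1, -3, 2))   # two roots
print(quad(1, 2, 1))    # one root
print(quad(1, 0, 1))    # no real roots
```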
In virtually all the places where line numbers may appear, alphanumeric labels are allowed. The usrquad subprogram can be rewritten using a label:
    sub usrquad(a, b, c, root1, root2) static
        static d
        d = fnd(a, b, c)
        usrquad = 0
        if d < 0 then goto done
        d = sqr(d)
        root1 = fnx(a, b, d)
        usrquad = 1
        if d = 0 then goto done
        root2 = fnx(a, b, -d)
        usrquad = 2
    done:
    end sub
The size an array needs to be is sometimes difficult or even impossible to know when a program is being written. Often we can make a reasonable guess and select a size we expect to be sufficiently huge. And yet Murphy’s Law guarantees that an application will come along that exceeds even the most generous estimate. The solution is a dynamically allocatable array, one whose size is set while the program is running.
For example, suppose the length of a list and the list of data need to be read from a file for subsequent processing:
    open "file" for input as #1
    input #1, n
    dim a(n)
    for i = 1 to n
        input #1, a(i)
    next i
    close #1
There are three ways to identify an array as dynamic, any one of which is sufficient.
The first is the appearance of the meta-command $DYNAMIC. Any subsequent explicit array dimensioning will be considered to be dynamic. This feature may be turned back off with the meta-command $STATIC.
The second method is to use a REDIM statement instead of a DIM statement. The two work similarly except that a REDIM statement guarantees the dimensioned array is dynamic.
The final method, illustrated in the previous example, is to specify a dimension that cannot be evaluated until the program is executed, that is, with an expression which includes a variable (or a function call). If the array has several dimensions, only one need include a variable to make the array dynamic.
As previously mentioned, a DIM statement for a static array is not an executable statement; instead it is simply a declaration to the compiler and does not have any effect on the program when it is running. On the other hand, a DIM statement really does execute for a dynamic array. This difference is important because it affects the placement of COMMON statements, which must appear after the dimensioning of static arrays but before any executable statements, such as the dimensioning of dynamic arrays. Thus the static versus dynamic status of an array must be reckoned with when coding COMMON statements.
Occasionally it is desirable to erase an array. But just as we have seen in the previous section, erasing means different things for a static versus dynamic array. For a static array, erasing sets all the elements of an array to zero (or the null string for a string array), just as the elements were when the program began running. Similarly, a dynamic array is returned to its original state meaning that the space for the array is deallocated.
For example in the previous section we made an array to hold values read from a file for later processing. Suppose that after processing we wish to free the space used by the array:
    erase a
For a dynamic array, it is also possible to reuse an array with different dimensions using the DIM or REDIM statements although the two statements work slightly differently. The best way to illustrate the difference is to suppose that the REDIM statement first implicitly erases the array and that an array can not be dimensioned unless it is erased. Thus the REDIM statement can be used at any time to redimension an array but the DIM statement can only be used if the array has been explicitly erased (or has not been dimensioned yet).
Within a subprogram, it is not possible to determine whether an array passed as a parameter is static or dynamic. Thus it is not possible to erase or dimension an array passed as a parameter.
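The run-time sizing discussed above can be compared with a sketch in Python, where every list is sized while the program runs. The function name read_list is an assumption, and an in-memory stream stands in for the file named “file” so that the sketch runs without any input file. The REDIM and ERASE analogues at the end are rough parallels, not equivalents.

```python
import io


def read_list(stream):
    """Read a count n, then n numbers, one per line."""
    n = int(stream.readline())
    a = [0.0] * n                  # sized at run time, like DIM a(n)
    for i in range(n):
        a[i] = float(stream.readline())
    return a


data = io.StringIO("3\n1.5\n2.5\n4\n")   # stands in for the data file
a = read_list(data)
print(a)

a = [0.0] * 10    # rough analogue of REDIM: a fresh array of a new size
del a             # rough analogue of ERASE: the storage is released
```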
I/O with ordinary disk files is well documented elsewhere. Under the UNIX operating system, however, the concept of a file has a much broader meaning. A file under UNIX includes the pipe, an interprocess communication channel, and the terminal (a.k.a. teletypewriter or tty, for historical reasons). These devices need special treatment.
Suppose we have an array of names that we want sorted and any duplicate entries removed:
    sub usrsortuniq(a$(1)) static
        static i
        open "pipe: sort >temp" for output as #1
        for i = 1 to ubound(a$)
            print #1, a$(i)
        next i
        close #1
        open "pipe: uniq <temp" for input as #1
        i = 0
        while not eof(1)
            i = i + 1
            line input #1, a$(i)
        wend
        close #1
        kill "temp"
        usrsortuniq = i
    end sub
In this example we have very easily disposed of the stated problem by delegating it to the UNIX utilities designed for that purpose. We have shipped the data off to “sort” and stored the results in a temporary file. Then we read the data back in via “uniq” and destroyed the temporary file. Because any duplicates would cause the list to shrink, this subprogram returns the new length of the list.
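The same one-direction-at-a-time arrangement can be sketched in Python with the standard subprocess module. The sketch assumes the UNIX utilities “sort” and “uniq” are on the search path; the function name sort_uniq and the choice of LC_ALL=C (to get a predictable ordering) are assumptions made for illustration.

```python
import os
import subprocess
import tempfile


def sort_uniq(names):
    """Return names sorted with duplicates removed, using sort and uniq."""
    env = dict(os.environ, LC_ALL="C")      # assumption: plain byte ordering
    fd, temp = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w") as out:
            # Output-only pipe to sort; sort's own output goes to the temporary file.
            subprocess.run(["sort"], input="".join(n + "\n" for n in names),
                           stdout=out, text=True, env=env, check=True)
        with open(temp) as src:
            # Input-only pipe from uniq, which reads the temporary file.
            result = subprocess.run(["uniq"], stdin=src, stdout=subprocess.PIPE,
                                    text=True, env=env, check=True)
    finally:
        os.remove(temp)                     # like KILL "temp"
    return result.stdout.splitlines()


print(sort_uniq(["carol", "alice", "bob", "alice"]))
```

At no point is the program both writing to and reading from the same command, so neither side can be left waiting on the other.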
Since we are usually interested in both the input and the output of a command, it would seem beneficial to create a pipe in both directions. In the example from the previous section, we might prefer to perform the sort and uniq as a single command — both writing to and reading from the pipe. There is very little to recommend this approach, however.
Because a pipe is merely a queue of characters (with finite length) between two processes, the creation of a pair of mutually dependent processes is asking for a deadlock: it is easy to create a situation where both processes are waiting for input from each other.
Consider the following program:
    open "pipe: sort" as #1
    for i = 1 to 10
        print #1, a$(i)
    next i
    for i = 1 to 10
        line input #1, a$(i)
    next i
    close #1
In this example, the data is written from the array to “sort” and then read back from “sort” to the array. Or is it? Unfortunately, this program is doomed. “Sort” will keep trying to read data until the pipe is closed — while we are trying to read the data back with LINE INPUT. The processes will deadlock.
We need to close the output side (from our program’s point of view) so that the command created by the OPEN will know there is no more data, while leaving the input side open to read back the results.
We can do this with the EOF function. This function works in a special way for pipes. Firstly, there is really no such thing as an end-of-file on a pipe since the command thus created can send back data at any time. In reality the only sure test for end-of-file on a pipe is testing for the termination of the command. So end-of-file on a pipe really means termination of the command process. The EOF function on a pipe also has the special side effect of closing the output side of the pipe, the need for which has already been discussed.
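The need to close the output side before reading can be seen in a Python sketch, where the close is an explicit call rather than a side effect of an end-of-file test. The sketch assumes “sort” is on the search path; the function name sort_lines and the LC_ALL=C setting are assumptions. It is safe to write everything first in this particular case because “sort” produces no output until its input ends.

```python
import os
import subprocess


def sort_lines(lines):
    """Sort a list of strings through a two-way pipe to sort."""
    proc = subprocess.Popen(["sort"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True,
                            env=dict(os.environ, LC_ALL="C"))
    for line in lines:
        proc.stdin.write(line + "\n")
    proc.stdin.close()              # sort now knows there is no more input
    # Reading continues until sort closes its end of the pipe.
    result = [line.rstrip("\n") for line in proc.stdout]
    proc.stdout.close()
    proc.wait()
    return result


print(sort_lines(["pear", "apple", "fig"]))
```

Without the close, the read loop would wait on “sort” while “sort” waited for more input, which is the deadlock described above.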
Armed with this feature, suppose we would like to write a program similar to the example in the previous section except this time we would like to select from a list of phone numbers those in Northeastern Ohio:
    open "pipe: fgrep '(216)'" as #1
    for i = 1 to 1000
        print #1, a$(i)
    next i
    while not eof(1)
        line input #1, a$
        print a$
    wend
The difficulty is that all the output is written out before any attempt is made to read. On an ordinary file this might seem perfectly reasonable. But a pipe is merely a queue with finite length. Suppose instead of “fgrep” our command was a filter paper containing ground coffee. Suppose we then began to output lots of hot water to the filter paper and intended to continue to do so until our supply of hot water was exhausted. No doubt we would have a mess of coffee on the table. Of course this program is not going to cause your data to spill on your desk but it will hang if we do not make provisions to receive the data before we finish sending it.
In order to read data from a pipe, we need to know if there is any available to be read lest we create a deadlock trying to read data that is not available.
We can find out how many characters are available on a pipe with the LOF function:
    open "pipe: cat" as #1
    for i = 1 to 1000
        if lof(1) then line input #1, a$: print a$
        print #1, a$
    next i
    while not eof(1)
        line input #1, a$
        print a$
    wend
The program in the preceding section will deadlock because LOF is used only to test for a single character which is not sufficient to guarantee that an entire line of input is available.
If we were sure of a reasonable upper bound for line length, say 512, we could rewrite the code:
    open "pipe: cat" as #1
    for i = 1 to 1000
        if lof(1) > 512 then line input #1, a$: print a$
        print #1, a$
    next i
    while not eof(1)
        line input #1, a$
        print a$
    wend
Have we slain the deadlock dragon yet? Perhaps, but we have other troubles. As previously discussed, the EOF function tests for the termination of the process. While it is not possible for the process to terminate while data remains on the pipe, it could happen that we have the misfortune of attempting our last LINE INPUT after the process has produced its last output but before the process terminates. The deadly sequence of events is: (1) the process sends its last output, (2) our LINE INPUT reads this last output, (3) our EOF sees the process is still running, (4) the process stops running, and, (5) our LINE INPUT attempts to read data that does not exist.
So our previous example will produce an error. The solution is to read data only after LOF verifies its existence:
    open "pipe: cat" as #1
    for i = 1 to 1000
        if lof(1) > 512 then line input #1, a$: print a$
        print #1, a$
    next i
    while not eof(1)
        if lof(1) then line input #1, a$: print a$
    wend
Empirical evidence is not sufficient to prove the appropriateness of such a program since the relative timing of two processes is entirely dependent on the scheduling of the operating system.
If at all possible, it is probably wisest to open the pipe for output only and send the process’s output to a temporary file; then open the file for input and read it in, similar to the example given at the beginning of this discussion of pipes.
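For comparison, where a two-way pipe cannot be avoided in other environments, a common way to keep both directions moving is to write from one thread while reading from another. The Python sketch below illustrates that general technique; it is not a Basmark QuickBASIC facility. It assumes “cat” is on the search path, and the function name through_filter is hypothetical. The test data is deliberately larger than a typical pipe queue.

```python
import subprocess
import threading


def through_filter(command, lines):
    """Send lines to a filter and collect its output without deadlocking."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)

    def feed():
        for line in lines:
            proc.stdin.write(line + "\n")
        proc.stdin.close()          # tell the filter there is no more input

    writer = threading.Thread(target=feed)
    writer.start()                  # writing proceeds while we read below
    output = [line.rstrip("\n") for line in proc.stdout]
    writer.join()
    proc.stdout.close()
    proc.wait()
    return output


data = ["line %d" % i for i in range(20000)]
same = through_filter(["cat"], data) == data
print(same)
```

Because the reader drains the pipe while the writer fills it, neither queue can fill up and stall the other side.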
Sometimes it is convenient to access another terminal or serial port. As with pipes, it is also possible to perform two-way communication:
    open "/dev/tty11" as #1
    input #1, "What year were you born"; a
    if a < 100 then a = a + 1900
    if val(right$(date$, 4)) - a < 16 then
        print #1, "You can't drive in Ohio"
    end if
    close #1
This (admittedly contrived) example of two-way communication with a terminal tests the driving age of the user.
Suppose we would like to set up two-way communication with a modem for the purpose of dialing up another computer. We have no way of knowing when the other computer will make input available to us. Some other BASIC systems provide this information via the LOC function. Unfortunately, this facility is impractical on a multi-user system. Another method is used:
    on error goto recovery
    open "/dev/tty11" as #1 len = 1
    shell "stty 1200 </dev/tty11"
    reread:
    a$ = inkey$
    while a$ <> ""
        print #1, a$;
        a$ = inkey$
    wend
    while 1
        a$ = input$(1, 1)
        print a$;
    wend
    recovery:
    resume reread
This program simply manages the modem for us. All input from the user terminal is output to the modem and vice versa. If there is no input from the modem, the INPUT$ function will time-out after one second and produce an error. The error recovery routine then attempts to read from the terminal. This program needs to be interrupted in order to terminate.
If the time-out period is not set with a LEN parameter, the program will wait forever; the programmer must determine the appropriate method. The sorts of deadlock problems outlined in the “Pipes” section are not possible with ordinary terminals, save the straightforward case of waiting for input that never arrives. On the other hand, when attached through a modem to a like-minded computer, this program can cause difficulties.
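The time-out read at the heart of the modem program can be sketched in Python with the standard select module. This shows the general idea only and says nothing about how Basmark QuickBASIC implements LEN; the function name read_with_timeout is an assumption, and an ordinary pipe stands in for the serial line so that the sketch runs without a modem. Where the BASIC program receives an error on time-out, this sketch returns None.

```python
import os
import select


def read_with_timeout(fd, seconds):
    """Return up to one byte from fd, or None if nothing arrives in time."""
    ready, _, _ = select.select([fd], [], [], seconds)
    if not ready:
        return None                 # timed out with no input available
    return os.read(fd, 1)


r, w = os.pipe()                    # stands in for the serial line
first = read_with_timeout(r, 0.1)   # nothing has been sent yet
os.write(w, b"A")
second = read_with_timeout(r, 0.1)  # one byte is waiting
print(first, second)
```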
A number of powerful features are available in the Basmark brand of QuickBASIC. These features should make it easier to write better programs more quickly. Start writing programs!
from The Basmark QuickBASIC Programmer’s Manual by Lawrence Leinweber