A QuickBASIC Tutorial

A number of features of the Basmark QuickBASIC compiler need describing.  Although the QuickBASIC Programmer’s Manual in general is designed for reference, some topics are not adequately treated by this format.  This document means to collect the various components in the several areas into coherent discussions for the purpose of instruction.  This document does not attempt to describe these topics exhaustively because the instructive qualities would suffer. 

The author would be foolhardy to attempt to describe the fundamentals of QuickBASIC here since the topic has been discussed more eloquently by others.  For a basic QuickBASIC tutorial, the reader may choose from the many available works.  Rather than compete with these works, this document focuses on the features of Basmark QuickBASIC which are not as thoroughly discussed in independent publications. 

With this in mind, a tutorial is presented in several topic areas: structured programming, dynamically allocatable arrays and input/output. 

1. STRUCTURED PROGRAMMING

For the sake of discussion, suppose we are interested in finding the roots of a quadratic equation, that is the values of x such that,

ax² + bx + c = 0
where a, b and c are known values.  From algebra we know that,

x = (–b ± sqrt(b² – 4ac)) ÷ (2a)

1.1 User-defined Functions

One might code these solutions as:

print (-b + sqr(b * b - 4 * a * c)) / (a + a)
print (-b - sqr(b * b - 4 * a * c)) / (a + a)
But there is no sense writing the same code twice since it is tough enough to get a program working the first time.  Besides, this solution might be handy elsewhere.  So, it is better to define functions. 
def fnd(a, b, c) = b * b - 4 * a * c
def fnx(a, b, d) = (-b + d) / (a + a)
def fnroot1(a, b, c) = fnx(a, b,  sqr(fnd(a, b, c)))
def fnroot2(a, b, c) = fnx(a, b, -sqr(fnd(a, b, c)))
print fnroot1(a, b, c), fnroot2(a, b, c)
This is not so redundant but there is still a problem: the value of fnd could be negative which would cause an error in the square root function.  The error can be avoided:
def fnroot1(a, b, c)
	c = fnd(a, b, c)
	if c < 0 then exit def
	fnroot1 = fnx(a, b, sqr(c))
end def
Fnroot2 can be coded similarly.  This is a multi-line user-defined function.  The error is avoided with EXIT DEF which terminates execution of the function.  If the value of fnd is non-negative, the value of the function is set by assigning to fnroot1, the name of the function.  The multi-line function begins just like the single-line version except the “= expr” portion is eliminated.  The definition ends with END DEF. 

As with the single-line user-defined function, values are “passed by value,” that is, copies of the values passed from the calling routines are used.  Thus, the assignment to “c” in the first line in the example above does not affect the value of “c” in the PRINT statement.  Also, the parameters in the function definition are private to the function, so we could have assigned values to a, b and c within the function without fear of clobbering a, b or c outside the function.
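For readers who think in C, which passes scalar arguments by value in much the same way, the point can be sketched as follows.  This is only an illustration: has_real_root is a name invented here, and the code is an analogy rather than anything Basmark QuickBASIC generates.

```c
/* Discriminant of ax^2 + bx + c, mirroring fnd in the text. */
double fnd(double a, double b, double c)
{
    return b * b - 4.0 * a * c;
}

/* Returns 1 if the equation has a real root, 0 otherwise.
 * Like the multi-line fnroot1, it reuses the parameter c as scratch
 * space.  Because c arrived by value, only this function's private
 * copy is overwritten.  (has_real_root is a hypothetical name.) */
int has_real_root(double a, double b, double c)
{
    c = fnd(a, b, c);   /* changes the local copy only */
    return c >= 0.0;
}
```

If the caller holds c = 2 and calls has_real_root(a, b, c), its own c is still 2 afterwards, even though the function assigned to its copy.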

We are welcome to use other variables within a user-defined function but these are not private copies, so we must be careful not to clobber values needed outside the function. 

The previous example avoids an error with the square root function, but it does not report the problem in a meaningful way.  The preferable solution would be to report not only the roots, but also the number of roots.  A quadratic equation has zero, one or two roots depending on whether b² - 4ac is negative, zero or positive.  It would be most useful to have a function which returned the number of roots as its value and set the appropriate root values as well:

def fnd(a, b, c) = b * b - 4 * a * c
def fnx(a, b, d) = (-b + d) / (a + a)
def fnquad(a, b, c)
	c = fnd(a, b, c)
	fnquad = 0
	if c < 0 then exit def
	c = sqr(c)
	root1 = fnx(a, b, c)
	fnquad = 1
	if c = 0 then exit def
	root2 = fnx(a, b, -c)
	fnquad = 2
end def

n = fnquad(a, b, c)
if n > 0 then print root1
if n > 1 then print root2
This solution works nicely.  Notice that root1 and root2 are used to hold the roots.  These variables are not private (or “local”) to the function and while such variables must be used carefully to avoid clobbering values needed outside the function, here they are used intentionally — to return values from the function. 

1.2 Subprograms

Although the previous example is good, it suffers from the need for special variables which, inevitably, leads to confusion about the side effects of the function.  After all, if we were to interject code between the call of the function and the use of root1 and root2, it would become unclear that these variables were set by the function. 

We would like to be able to pass variables to the function whose values would be set upon return.  It would be relatively clear that the function had an effect on these variables and we would not be constrained to use particular names like root1 and root2.  Fortunately, subprograms have this characteristic:

sub quad(a, b, c, nroot, root1, root2) static
	static d

	d = fnd(a, b, c)
	nroot = 0
	if d < 0 then exit sub
	d = sqr(d)
	root1 = fnx(a, b, d)
	nroot = 1
	if d = 0 then exit sub
	root2 = fnx(a, b, -d)
	nroot = 2
end sub

call quad(a, b, c, n, r1, r2)
if n > 0 then print r1
if n > 1 then print r2
This example uses a subprogram.  Notice that values are assigned to parameter variables and are used subsequently in the calling code.  This technique would be ineffective with a user-defined function where parameters are passed by value, but in subprograms, the values are passed by address so the corresponding positional parameter variables in the calling code are modified.  For this reason, we have also introduced “d” (to hold the value of fnd) because reusing “c” would modify the corresponding parameter in the calling code. 

In addition, we have declared “d” in a STATIC statement so we need not be concerned about the “d” within the subprogram clobbering the value of “d” outside the subprogram.  Declaring a variable in a STATIC statement creates a private, local variable regardless of what appears outside the subprogram.  Actually, this declaration is usually redundant because without any declaration variables used within a subprogram are local variables, different from variables with the same name outside the subprogram.  Unfortunately, as we shall see later, this default can be overridden by code outside the subprogram, which makes us uncomfortable.  The reason we write subprograms in the first place is so our code will be safe from the hectic world outside.  So to be perfectly safe, the “subprogrammer” must use the STATIC statement.  Nevertheless, for a quick job we might omit the STATIC statement and get away with it. 

At other times, we might need a value from outside a subprogram.  As we now know, by default variables used within a subprogram are local to the subprogram.  We can override this default and gain access to a “global” variable with the SHARED statement, another declarative statement.  In the subprogram above, if we had declared SHARED instead of STATIC:

	shared d
then this “d” would refer to the same “d” outside the subprogram. 

The form of a subprogram is always like that given in this example.  The header has the word SUB followed by the subprogram’s name followed by the list of arguments (or parameters) in parentheses and finally the word STATIC.  If there are no arguments to the subprogram, the parentheses are also omitted.  The subprogram ends with END SUB, just as a multi-line user-defined function ends with END DEF.  Notice the use of EXIT SUB to terminate execution of the subprogram: it is analogous to the EXIT DEF in the multi-line user-defined function. 

The previous example is good but it would be better if the subprogram could return the number of roots as its value, as we did in our last example user-defined function, fnquad.  Unlike other structured BASIC dialects, Basmark QuickBASIC allows a value to be returned from a subprogram:

sub usrquad(a, b, c, root1, root2) static
	static d

	d = fnd(a, b, c)
	usrquad = 0
	if d < 0 then exit sub
	d = sqr(d)
	root1 = fnx(a, b, d)
	usrquad = 1
	if d = 0 then exit sub
	root2 = fnx(a, b, -d)
	usrquad = 2
end sub

n = usrquad(a, b, c, r1, r2)
if n > 0 then print r1
if n > 1 then print r2
Notice that this subprogram name begins with “usr”; this is a requirement of a subprogram that returns a value — just like the requirement that a user-defined function begins with “fn”.  Notice also the assignment to the subprogram name works the same as in a multi-line user-defined function. 

1.2.1 Using Subprograms

In the example above we passed simple variables to the subprogram.  Three of these (a, b and c) represented “input” parameters, variables used only for their values, and two others (r1 and r2) represented “output” parameters, variables used as storage locations.  The difference is important, but it is determined only by convention and may vary from subprogram to subprogram.  Any expression may be used as a parameter to a subprogram.  But what happens when these parameters are misused, and how do we avoid these problems?

Suppose our subprogram had been miscoded and set the value of “c”, altering the value of “c” in the calling code.  That would be a difficult bug to track down since we would tend to look for the bug in the calling code rather than the erroneous subprogram.  We can protect our input parameters with parentheses:

n = usrquad((a), (b), (c), r1, r2)
A parameter to a subprogram in parentheses is always copied to a safe place and the copy made available to the subprogram so the input parameters are safe.  Another benefit of this technique is that it creates a clear distinction between the input and output parameters. 
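The effect of the parentheses can be pictured in C terms, where every subprogram parameter arrives as an address.  The sketch below is an analogy under that assumption, not a description of the compiler’s actual output; bad_disc and the two wrapper functions are names invented for the illustration.

```c
/* A deliberately miscoded routine: it overwrites its "input" c.
 * All parameters arrive by address, as they do in a subprogram. */
void bad_disc(float *a, float *b, float *c, float *d)
{
    *c = (*b) * (*b) - 4.0f * (*a) * (*c);   /* clobbers the caller's c */
    *d = *c;
}

/* Unprotected call: c itself is handed over.
 * Returns whatever is left in c afterwards. */
float c_after_plain_call(float a, float b, float c)
{
    float d;
    bad_disc(&a, &b, &c, &d);
    return c;
}

/* Protected call: a throwaway copy is handed over instead, which is
 * the effect the text describes for a parenthesized parameter (c). */
float c_after_protected_call(float a, float b, float c)
{
    float d;
    float c_copy = c;
    bad_disc(&a, &b, &c_copy, &d);
    return c;
}
```

With a = 1, b = -3 and c = 2, the plain call leaves 1 in c (the discriminant), while the protected call leaves c at 2.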

An output parameter from a subprogram ought to be a storage location, a simple variable or a reference to an array element.  Suppose our calling code had been miscoded and used an expression like “x+y” or “42.0”.  These are not storage locations at all.  Fortunately, expressions like these are also copied to a safe place and the copy made available to the subprogram.  So while there is little to recommend passing a constant as an output parameter location to a subprogram, there is no harm in it. 

1.2.2 Arrays and Subprograms

Suppose we have a job which requires array multiplication.  It would be most convenient to have a single subprogram to do the job and test that the array dimensions are appropriate:

if usrarrmul(c(), a(), b()) then print "Bad dimensions"
In fact, subprograms may take whole arrays as arguments — the calling code simply indicates that the variables are arrays by adding an empty pair of parentheses after the name as shown in the example above. 

The following example illustrates array multiplication:

sub usrarrmul(c(2), a(2), b(2)) static
	static i, j, k, sum

	usrarrmul = -1
	if ubound(a, 1) <> ubound(c, 1) then exit sub
	if ubound(a, 2) <> ubound(b, 1) then exit sub
	if ubound(b, 2) <> ubound(c, 2) then exit sub
	usrarrmul = 0
	for i = 1 to ubound(c, 1)
		for j = 1 to ubound(c, 2)
			sum = 0
			for k = 1 to ubound(a, 2)
				sum = sum + a(i, k) * b(k, j)
			next k
			c(i, j) = sum
		next j
	next i
end sub
The declaration of arguments in this subprogram is interesting.  The parentheses after the argument name indicate that it is an array.  A single constant within the parentheses is required: it indicates the number of dimensions in the array.

A new function UBOUND is illustrated here; it returns the upper bound of the array given by the first argument along the dimension given by the second argument.  For single dimension arrays, the second argument may be omitted.  With this function, a subprogram may be written which is not dependent on a particular array size and without the need for special arguments to indicate size. 
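To appreciate what UBOUND buys us, consider the same job in C, where an array does not know its own bounds and the sizes must travel with it.  The sketch below is one way to do that; the matrix type and the names at and arrmul are assumptions made for the illustration, and indices are zero-based as is usual in C.

```c
#include <stddef.h>

/* A matrix that carries its own bounds, standing in for what UBOUND
 * reports in QuickBASIC.  Storage is row-major. */
typedef struct {
    size_t rows, cols;
    double *data;
} matrix;

/* Address of element (i, j), zero-based. */
double *at(const matrix *m, size_t i, size_t j)
{
    return &m->data[i * m->cols + j];
}

/* c = a * b.  Returns -1 if the dimensions do not agree and 0 on
 * success, following the convention of usrarrmul. */
int arrmul(matrix *c, const matrix *a, const matrix *b)
{
    size_t i, j, k;

    if (a->rows != c->rows) return -1;
    if (a->cols != b->rows) return -1;
    if (b->cols != c->cols) return -1;

    for (i = 0; i < c->rows; i++) {
        for (j = 0; j < c->cols; j++) {
            double sum = 0.0;
            for (k = 0; k < a->cols; k++)
                sum += *at(a, i, k) * *at(b, k, j);
            *at(c, i, j) = sum;
        }
    }
    return 0;
}
```

The caller has to fill in rows and cols by hand for every matrix, which is exactly the bookkeeping that UBOUND spares the QuickBASIC subprogram.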

1.2.3 Speaking in Tongues

For those familiar with other programming languages, particularly C, it is often convenient to write specialized code in other languages.  With Basmark QuickBASIC, it is possible to interface QuickBASIC with other languages.  Some languages, such as FORTRAN, may require an intermediary routine written in C or assembler, but these latter two languages can always be interfaced directly with QuickBASIC.  In fact, Basmark QuickBASIC subprograms are defined in terms of the C programming language so truly portable bilingual programs can be written in these languages.

For example, the usrquad subprogram described earlier could just as easily be written in C:

#define FND(a, b, c)	((b) * (b) - 4 * (a) * (c))
#define FNX(a, b, d)	((-(b) + (d)) / ((a) + (a)))

float FLTquad(a, b, c, root1, root2)
float *a, *b, *c, *root1, *root2;
{
	double d, sqrt();

	d = FND(*a, *b, *c);
	if (d < 0.0) return (0.0);
	d = sqrt(d);
	*root1 = FNX(*a, *b, d);
	if (d == 0.0) return (1.0);
	*root2 = FNX(*a, *b, -d);
	return (2.0);
}
Note that user-defined functions are not available to routines in other languages (or any other module); we have chosen to implement that code as macros. 

1.3 Multiple Modules

For large programming projects, the size of the program can become a bit unruly: the program takes longer to load in editors, the compilation time becomes unreasonably long (even if much of the code has not been altered since the last compilation) and the program generally suffers from a lack of organization.  To cope with these difficulties, we can organize the code into a series of coherent files (or modules).  The modules are compiled separately so only those modules which have been modified need re-compilation.  Finally a loader is used to produce an executable program from all the compiled modules.  This scheme is well-supported by the UNIX system and the Basmark QuickBASIC compiler, so the programmer needs only a minimal awareness of the process; indeed, the process is completely invisible in the single-module scenario.

In any case, one module is designated the “main” module and we shall call the others “secondary” modules.  The main module is the module where execution begins.  The secondary modules, if any, can only be executed as a result of subprogram calls from other modules. 

Suppose we have a main module and two secondary modules:

compute

 rem $main        

 call indata
 call compute
 call outdata


 sub compute
     .
     .
     .
 end sub


input

 '$include:'rep.i'

 sub indata
 on error goto 100
     .
     .
     .
 end sub


 100 call report
     resume next


output

 '$include:'rep.i'

 sub outdata
     .
     .
     .
 end sub

 sub report
     .
     .
     .
 end sub


The main module, named “compute”, calls subprograms in its own module and other modules.  Subprogram calls are the only method by which code in other modules can be executed and here the method is used quite effectively.  In fact, some programming theorists would be quite pleased with this structure.

The QuickBASIC compiler must be informed which module is the main module.  There are three methods of doing this.  The simplest is to list the main module first on the command line:

basic compute.b input.b output.b
The compiler will assume that the first module named, compute, is the main module. 

Of course, specifying all the modules on the command line does not allow for separate compilation, one of the principal advantages of multiple modules.  We would prefer to compile the modules separately and then load them together:

basic -c compute.b
basic -c -M input.b
basic -c -M output.b
basic compute.o input.o output.o
The “-c” switch indicates compile but do not load.  The last line of this example loads the object modules (the resultant files of the compilations) into a single executable.  The second method of identifying the main module is illustrated above.  Notice that the “-M” is used to indicate the modules that are not main modules (even though they are the first modules specified on the command line).  With this example we begin to see the advantages of separate compilation.  If, for example, we had run the program and decided to make changes only in, say, the output module, then we could confidently omit the first two commands in this example in recompiling this program which would save us some time waiting for the computer. 

The final method for informing the QuickBASIC compiler of the main module is illustrated in the diagram above.  It is often preferable, and more straightforward, to use the same command line for all modules (rather than fool with special switches) and leave the identification of the main module to the code of the main module itself.  A special meta-command, $MAIN, may be placed at the beginning of the main module, as shown in the diagram above.  This meta-command overrides the meaning of the “-M” switch so that the “-M” switch can appear on the command line for all modules. 

This method, placing the $MAIN meta-command at the beginning of the main module, is especially convenient in conjunction with “make”, the program maintenance program.  This UNIX utility uses file modification dates and a file, named “Makefile”, of user supplied dependency information to direct compilation of a multi-module program.  “Make” even has implicit rules for compiling modules for standard UNIX languages like C and FORTRAN.  This utility is just what the doctor ordered to assist programmers in avoiding redundant compilation of unmodified modules and yet ensuring the production of an up to date executable program.  So what more could you ask for? Well, an implicit rule for compiling QuickBASIC modules, of course:

.SUFFIXES: .b
.b.o:	; basic -c -M $<
With this rule in your “Makefile”, QuickBASIC modules can be automatically found and compiled into object modules.  The “Makefile” for our example could be completed as follows:
OBJS=compute.o input.o output.o
a.out: $(OBJS)
	basic $(OBJS)
input.o output.o: rep.i
Notice that the last line even identifies the modules in the diagram above that use the include file, rep.i.  If the include file is modified, these modules, properly, will be recompiled. 

1.3.1 Error Recovery and the Data Area

In multi-module programs, each module has its own private error recovery routine and data area (DATA statements). 

A unique property of the error recovery routine, even in secondary modules, is that this routine is not contained in a subprogram.  Instead the error recovery routine lies outside of all subprograms.  The “ON ERROR GOTO x” statement, of course, may be inside a subprogram.  This allows each module to provide a routine to recover from errors that may occur in the module. 

In the diagram in the preceding section, the input module contains a routine at line 100 for reporting errors in the input.  However, if an error occurs in the compute or output module, such an error is not trapped and execution does not proceed with the input module’s error recovery routine.  The other modules could, of course, have their own private error recovery routines. 

1.3.2 COMMON Data

In addition to the passing of parameters to subprograms, data may be communicated between modules with labeled and unlabeled COMMON blocks.  Because the order of the items in a COMMON block is crucial, it is standard practice to place COMMON statements in an include file to ensure consistency.  In the diagram in the preceding section, suppose “rep.i” contains the declaration of a COMMON block for reporting errors:

COMMON SHARED /report/ errnum, errname$
These items, errnum and errname$, are available to all modules that include this declaration.  The COMMON block name, “report”, is used to provide privacy; by convention other communities of modules will use other COMMON block names. 

A unique COMMON block is the unlabeled COMMON block, which has no name.  Declarations for this COMMON block omit the slashes as well.  The items in the unlabeled COMMON block, alone, are communicated across chains (with the CHAIN statement). 

COMMON statements may optionally declare the items “SHARED”, as in this example.  These items will be available inside all subprograms in the module.  (By default, variables are private to subprograms.) This form of the COMMON statement is often used solely to exploit the SHARED declaration method. 

1.3.3 Segmented Architectures

A special note is in order regarding segmented architectures, especially the one found on the Intel family of microprocessors.

Very large programs coded as a single module often will not compile because the code is too large.  The problem can be solved.  The Basmark QuickBASIC compiler implements each module as a single code segment although any number of code segments may be loaded together into an executable program.  The solution is simply to rearrange the code into a series of smaller modules.  The programmer need not be aware of the segmentation system to implement this solution. 

1.4 Multi-line Conditionals

The orientation toward GOTO statements in the QuickBASIC language has always distressed programming theorists.  Their objections, for the most part, are reasonable— it being very difficult to grasp the meaning of code that seems to branch about arbitrarily.  But while the theorists have regarded the programmers as crude and the programmers have regarded the theorists as high-falutin, the problem was probably never the GOTO statement at all but the limitations of the conditional statement.  We do not have any reason to believe that college professors will start recommending QuickBASIC nor that programmers will stop using GOTO statements, but the limitations of the conditional statement have been lifted. 

To illustrate, the usrquad subprogram example we were fond of may be rewritten in a more readable way:

sub usrquad(a, b, c, root1, root2) static
	static d

	d = fnd(a, b, c)
	if d < 0 then
		usrquad = 0
	elseif d = 0 then
		root1 = fnx(a, b, 0.0)
		usrquad = 1
	else d = sqr(d)
		root1 = fnx(a, b,  d)
		root2 = fnx(a, b, -d)
		usrquad = 2
	end if
end sub
This example illustrates all the elements of the multi-line conditional, or “block” IF statement.  It is important to note that there are restrictions on where these statements can appear on a line.  A block IF, ELSEIF, ELSE or END IF statement must be the first on the line.  Also, the block IF statement can not have a statement after it on the line.  Other kinds of statements can be put together (separated by colons) on a single line — but these statements have restrictions. 

All block IF statements have an IF and an END IF statement but the ELSEIF and ELSE statements are optional.  Any number of ELSEIF statements may be used.  This makes it possible to create structures similar to the “case” statement found in many other languages. 

In the example above the multi-line conditional makes the three-way nature of the code apparent.  The consequences of each branch are easy to see. 

1.5 Labels

In virtually all the places where line numbers may appear, alphanumeric labels are allowed.  The usrquad subprogram can be rewritten using a label:

sub usrquad(a, b, c, root1, root2) static
	static d

	d = fnd(a, b, c)
	usrquad = 0
	if d < 0 then goto done
	d = sqr(d)
	root1 = fnx(a, b, d)
	usrquad = 1
	if d = 0 then goto done
	root2 = fnx(a, b, -d)
	usrquad = 2
done:
end sub
In the example, the label “done” is used as the location where the subprogram is done.  The label is used just like a line number except that, where it marks a location, it is followed by a colon.  As previously mentioned, a label is not allowed everywhere that a line number is allowed, so while we might be tempted to omit the word GOTO in the example above, we can not, because one of the places a label is not allowed is immediately after the word THEN.  A label always begins with a letter.

2. DYNAMICALLY-ALLOCATABLE ARRAYS

The size an array needs to be is sometimes difficult or even impossible to know when a program is being written.  Often we can make a reasonable guess and select a size we expect to be sufficiently huge.  And yet Murphy’s Law guarantees that an application will come along that exceeds even the most generous estimate.  The solution is a dynamically allocatable array, one whose size is fixed only while the program is running.

For example, suppose the length of a list and the list of data need to be read from a file for subsequent processing:

open "file" for input as #1
input #1, n
dim a(n)
for i = 1 to n
	input #1, a(i)
next i
close #1
The array “a” is allocated dynamically.  The size of the array is determined only when the DIM statement is executed, unlike a static array for which the DIM statement is not executable. 

There are three ways to identify an array as dynamic, any one of which is sufficient.

The first is the appearance of the meta-command $DYNAMIC.  Any subsequent explicit array dimensioning will be considered to be dynamic.  This feature may be turned back off with the meta-command $STATIC. 

The second method is to use a REDIM statement instead of a DIM statement.  The two work similarly except that a REDIM statement guarantees the dimensioned array is dynamic. 

The final method, illustrated in the previous example, is to specify a dimension that can not be evaluated until the program is executed, that is, with an expression which includes a variable (or a function call).  If the array has several dimensions, only one need include a variable to make the array dynamic.

As previously mentioned, a DIM statement for a static array is not an executable statement; instead it is simply a declaration to the compiler and does not have any effect on the program when it is running.  On the other hand, a DIM statement really does execute for a dynamic array.  This difference is important because it affects the placement of COMMON statements, which must appear after the dimensioning of static arrays but before any executable statements, such as the dimensioning of dynamic arrays.  Thus the static versus dynamic status of an array must be reckoned with when coding COMMON statements.

2.1 Erasing Arrays

Occasionally it is desirable to erase an array.  But just as we have seen in the previous section, erasing means different things for a static versus dynamic array.  For a static array, erasing sets all the elements of an array to zero (or the null string for a string array), just as the elements were when the program began running.  Similarly, a dynamic array is returned to its original state meaning that the space for the array is deallocated. 

For example in the previous section we made an array to hold values read from a file for later processing.  Suppose that after processing we wish to free the space used by the array:

erase a
Notice that no parentheses or subscripts are used in this reference to an array in an ERASE statement. 

2.2 Redimensioning Arrays

For a dynamic array, it is also possible to reuse an array with different dimensions using the DIM or REDIM statements although the two statements work slightly differently.  The best way to illustrate the difference is to suppose that the REDIM statement first implicitly erases the array and that an array can not be dimensioned unless it is erased.  Thus the REDIM statement can be used at any time to redimension an array but the DIM statement can only be used if the array has been explicitly erased (or has not been dimensioned yet). 
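The supposition in the preceding paragraph can be turned into a small working model.  The C sketch below is only that, a model: the dynarray type and the da_ names are invented here, it ignores QuickBASIC’s element numbering, it assumes a size greater than zero, and it makes no claim about how the Basmark run-time actually manages storage.

```c
#include <stdlib.h>

/* A toy model of a dynamic array: data == NULL means "erased"
 * (or never dimensioned). */
typedef struct {
    size_t n;
    double *data;
} dynarray;

/* ERASE: give the storage back and return to the original state. */
void da_erase(dynarray *a)
{
    free(a->data);
    a->data = NULL;
    a->n = 0;
}

/* DIM: legal only on an erased array.
 * Returns 0 on success, -1 if the array is already dimensioned
 * or no memory is available. */
int da_dim(dynarray *a, size_t n)
{
    if (a->data != NULL)
        return -1;
    a->data = calloc(n, sizeof *a->data);   /* elements start at zero */
    if (a->data == NULL)
        return -1;
    a->n = n;
    return 0;
}

/* REDIM: an implicit erase followed by DIM, so it always applies. */
int da_redim(dynarray *a, size_t n)
{
    da_erase(a);
    return da_dim(a, n);
}
```

In this model a second da_dim on a live array fails, while da_redim succeeds and hands back a freshly zeroed array, so the old contents are not carried over.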

Within a subprogram, it is not possible to determine whether an array passed as a parameter is static or dynamic.  Thus it is not possible to erase or dimension an array passed as a parameter. 

3. INPUT/OUTPUT

I/O with ordinary disk files is well documented elsewhere.  Under the UNIX operating system, however, the concept of a file has a much broader meaning.  A file under UNIX includes the pipe, an interprocess communication channel, and the terminal (a.k.a. teletypewriter or tty, for historical reasons).  These devices need special treatment. 

3.1 Pipes

Suppose we have an array of names that we want sorted and any duplicate entries removed:

sub usrsortuniq(a$(1)) static
	static i

	open "pipe: sort >temp" for output as #1
	for i = 1 to ubound(a$)
		print #1, a$(i)
	next i
	close #1
	open "pipe: uniq <temp" for input as #1
	i = 0
	while not eof(1)
		i = i + 1
		line input #1, a$(i)
	wend
	close #1
	kill "temp"
	usrsortuniq = i
end sub
This subprogram contains two examples of the OPEN statement with a pipe.  Notice that the filename in the OPEN statements has a unique form: “pipe: command”.  The command is a single line command to the UNIX shell, just as it might be typed at the terminal.  When open for output, this type of “file” causes the output (via statements like PRINT) to be sent directly to the standard input of the UNIX command.  Conversely, when open for input, this type of “file” causes the input (via statements like LINE INPUT) to be received directly from the standard output of the UNIX command.

In this example we have very easily disposed of the stated problem by delegating it to the UNIX utilities designed for that purpose.  We have shipped the data off to “sort” and stored the results in a temporary file.  Then we read the data back in via “uniq” and destroyed the temporary file.  Because any duplicates would cause the list to shrink, this subprogram returns the new length of the list. 
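C programmers will recognize the one-way pipe as the standard popen call, and the same subprogram can be sketched with it.  We assume a POSIX system with “sort” and “uniq” on the search path; the function name sort_uniq, the limit MAXLEN and the temporary file name are inventions of this sketch.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen and pclose */
#include <stdio.h>
#include <string.h>

#define MAXLEN 128                /* longest name handled (an assumption) */

/* Sorts names[0..n-1] and removes duplicates by delegating to the
 * UNIX utilities, as usrsortuniq does.  Returns the new count, or -1
 * on failure.  The temporary file name is hypothetical. */
int sort_uniq(char names[][MAXLEN], int n)
{
    const char *tmp = "sortuniq.tmp";
    char cmd[64];
    FILE *p;
    int i;

    /* Like OPEN "pipe: sort >temp" FOR OUTPUT: we write, sort reads. */
    snprintf(cmd, sizeof cmd, "sort > %s", tmp);
    p = popen(cmd, "w");
    if (p == NULL)
        return -1;
    for (i = 0; i < n; i++)
        fprintf(p, "%s\n", names[i]);
    if (pclose(p) != 0)           /* waits until sort has finished */
        return -1;

    /* Like OPEN "pipe: uniq <temp" FOR INPUT: uniq writes, we read. */
    snprintf(cmd, sizeof cmd, "uniq < %s", tmp);
    p = popen(cmd, "r");
    if (p == NULL)
        return -1;
    i = 0;
    while (i < n && fgets(names[i], MAXLEN, p) != NULL) {
        names[i][strcspn(names[i], "\n")] = '\0';   /* strip the newline */
        i++;
    }
    pclose(p);
    remove(tmp);                  /* the counterpart of KILL "temp" */
    return i;
}
```

Each popen stream goes in one direction only, which is why the temporary file is needed to carry the data from the first command to the second.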

3.1.1 Two-way Pipes and Deadlock

Since we are usually interested in both the input and the output of a command, it would seem beneficial to create a pipe in both directions.  In the example from the previous section, we might prefer to perform the sort and uniq as a single command — both writing to and reading from the pipe.  There is very little to recommend this approach, however. 

Because a pipe is merely a queue of characters (with finite length) between two processes, the creation of a pair of mutually dependent processes is asking for a deadlock: it is easy to create a situation where both processes are waiting for input from each other. 

Consider the following program:

open "pipe: sort" as #1
for i = 1 to 10
	print #1, a$(i)
next i
for i = 1 to 10
	line input #1, a$(i)
next i
close #1
The OPEN statement in this example is similar to that used for random I/O files, but because the filename refers to a pipe the statement has a special meaning: it creates a two-way pipe.  The “file” can be used for both input and output. 

In this example, the data is written from the array to “sort” and then read back from “sort” to the array.  Or is it? Unfortunately, this program is doomed.  “Sort” will keep trying to read data until the pipe is closed — while we are trying to read the data back with LINE INPUT.  The processes will deadlock. 

3.1.2 The EOF Function for Pipes

We need to close the output side (from our program’s point of view) so that the command created by the OPEN will know there is no more data, while leaving the input side open to read back the results.

We can do this with the EOF function.  This function works in a special way for pipes.  Firstly, there is really no such thing as an end-of-file on a pipe since the command thus created can send back data at any time.  In reality the only sure test for end-of-file on a pipe is testing for the termination of the command.  So end-of-file on a pipe really means termination of the command process.  The EOF function on a pipe also has the special side effect of closing the output side of the pipe, the need for which has already been discussed.
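What has to happen underneath can be sketched in C with the ordinary UNIX primitives.  This is our own illustration for a POSIX system, not the Basmark run-time; sort_through_pipe is a hypothetical name, and the sketch is only safe for input small enough to fit in the pipe, for reasons taken up below.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends `in` to the standard input of "sort" and collects its output
 * in out (NUL-terminated, at most cap - 1 bytes).  Returns the number
 * of bytes read back, or -1 on error. */
long sort_through_pipe(const char *in, char *out, size_t cap)
{
    int to_cmd[2], from_cmd[2];
    size_t len = strlen(in), used = 0;
    ssize_t k;
    pid_t pid;
    int ok;

    if (cap == 0 || pipe(to_cmd) < 0)
        return -1;
    if (pipe(from_cmd) < 0) {
        close(to_cmd[0]); close(to_cmd[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(to_cmd[0]); close(to_cmd[1]);
        close(from_cmd[0]); close(from_cmd[1]);
        return -1;
    }
    if (pid == 0) {                        /* child: becomes sort */
        dup2(to_cmd[0], STDIN_FILENO);
        dup2(from_cmd[1], STDOUT_FILENO);
        close(to_cmd[0]); close(to_cmd[1]);
        close(from_cmd[0]); close(from_cmd[1]);
        execlp("sort", "sort", (char *)NULL);
        _exit(127);                        /* exec failed */
    }
    close(to_cmd[0]);                      /* parent keeps one end of each */
    close(from_cmd[1]);

    ok = write(to_cmd[1], in, len) == (ssize_t)len;
    close(to_cmd[1]);      /* the crucial step: sort now sees end-of-file */

    while (used < cap - 1 &&
           (k = read(from_cmd[0], out + used, cap - 1 - used)) > 0)
        used += (size_t)k;
    out[used] = '\0';
    close(from_cmd[0]);
    waitpid(pid, NULL, 0);                 /* command has terminated */
    return ok ? (long)used : -1;
}
```

If the close of the write side is left out, the read loop waits for “sort” while “sort” waits for more input, which is the deadlock described earlier.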

Armed with this feature, suppose we would like to write a program similar to the example in the previous section except this time we would like to select from a list of phone numbers those in Northeastern Ohio:

open "pipe: fgrep '(216)'" as #1
for i = 1 to 1000
	print #1, a$(i)
next i
while not eof(1)
	line input #1, a$
	print a$
wend
Here the EOF function neatly ensures that the command, “fgrep”, expects no more input before we attempt to do our own inputting.  This program will not deadlock as would the example in the previous section.  Or will it? Unfortunately, this program may well deadlock for another reason. 

The difficulty is that all the output is written out before any attempt is made to read.  On an ordinary file this might seem perfectly reasonable.  But a pipe is merely a queue with finite length.  Suppose instead of “fgrep” our command was a filter paper containing ground coffee.  Suppose we then began to output lots of hot water to the filter paper and intended to continue to do so until our supply of hot water was exhausted.  No doubt we would have a mess of coffee on the table.  Of course this program is not going to cause your data to spill on your desk but it will hang if we do not make provisions to receive the data before we finish sending it. 

3.1.3 The LOF Function with Pipes

In order to read data from a pipe, we need to know if there is any available to be read lest we create a deadlock trying to read data that is not available. 

We can find out how many characters are available on a pipe with the LOF function:

open "pipe: cat" as #1
for i = 1 to 1000
	if lof(1) then line input #1, a$: print a$
	print #1, a$(i)
next i
while not eof(1)
	line input #1, a$
	print a$
wend
In this example, the value of LOF is simply used to determine if any characters are available on the pipe.  This program will not deadlock from writing too much data without a read, but it may still deadlock for another reason. 

3.1.4 More Deadlock Problems with Two-way Pipes

The program in the preceding section can deadlock because LOF is used only to test whether at least one character is available, which is not sufficient to guarantee that an entire line of input is available. 

If we were sure of a reasonable upper bound for line length, say 512 characters, we could rewrite the code:

open "pipe: cat" as #1
for i = 1 to 1000
	if lof(1) > 512 then line input #1, a$: print a$
	print #1, a$(i)
next i
while not eof(1)
	line input #1, a$
	print a$
wend
Here we are only reading when we know a full line is available.  Of course if we could potentially have very long lines, a single line might be sufficient to “clog” the pipe and we would have to consider writing and reading smaller units than lines (which might be very painful). 

Have we slain the deadlock dragon yet? Perhaps, but we have other troubles.  As previously discussed, the EOF function tests for the termination of the process.  While it is not possible for the process to terminate while data remains on the pipe, it could happen that we have the misfortune of attempting our last LINE INPUT after the process has produced its last output but before the process terminates.  The deadly sequence of events is: (1) the process sends its last output, (2) our LINE INPUT reads this last output, (3) our EOF sees the process is still running, (4) the process stops running, and (5) our LINE INPUT attempts to read data that does not exist. 

So our previous example, which ends with an unguarded LINE INPUT loop, can produce an error.  The solution is to read data only after LOF verifies its existence:

open "pipe: cat" as #1
for i = 1 to 1000
	if lof(1) > 512 then line input #1, a$: print a$
	print #1, a$(i)
next i
while not eof(1)
	if lof(1) then line input #1, a$: print a$
wend
What is the fatal flaw of this last example? We do not know, but considering the subtlety of the issues involved and the difficulty of proving the correctness of this type of program, we would not be surprised to find this program deadlocked. 

Empirical evidence is not sufficient to prove the correctness of such a program since the relative timing of two processes is entirely dependent on the scheduling of the operating system. 

If at all possible, it is probably wisest to open the pipe for output only and send the process’s output to a temporary file; then open the file for input and read it in, similar to the example given at the beginning of this discussion of pipes. 
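
Applied to the phone-number example, that advice might look like the sketch below.  It is only an illustration under several assumptions not confirmed in this section: that a pipe may be opened FOR OUTPUT, that the command text after “pipe:” may contain a shell redirection, that CLOSE on the pipe does not return until the command has finished writing, and that “/tmp/ohio.tmp” (a made-up name) is a writable scratch file. 

```basic
rem Sketch only: FOR OUTPUT on a pipe, the ">" redirection inside the
rem pipe command and the file name /tmp/ohio.tmp are assumptions.
open "pipe: fgrep '(216)' >/tmp/ohio.tmp" for output as #1
for i = 1 to 1000
	print #1, a$(i)
next i
rem Assumes CLOSE waits for fgrep to finish before returning.
close #1
rem The results are now in an ordinary file, so EOF behaves normally.
open "/tmp/ohio.tmp" for input as #1
while not eof(1)
	line input #1, a$
	print a$
wend
close #1
```

Because the program never reads from the pipe, no ordering of reads and writes can leave the two processes waiting on each other; the cost is one temporary file. 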

3.2 Ttys

Sometimes it is convenient to access another terminal or serial port.  As with pipes, it is also possible to perform two-way communication:

open "/dev/tty11" as #1
input #1, "What year were you born"; a
if a < 100 then a = a + 1900
if val(right$(date$, 4)) - a < 16 then
	print #1, "You can't drive in Ohio"
end if
close #1
Here the terminal “/dev/tty11” is opened with syntax normally used for random I/O files.  For ttys this implies two-way communication.  Note that the filenames for terminals vary from system to system.  Also, before executing this type of program it is necessary to ensure that no one is logged in on the terminal, that UNIX is not expecting anyone to log in on it, and that write permission has been obtained. 

This (admittedly contrived) example of two-way communication with a terminal checks whether the user is old enough to drive. 

3.2.1 Time-outs on Terminals

Suppose we would like to set up two-way communication with a modem for the purpose of dialing up another computer.  We have no way of knowing when the other computer will make input available to us.  Some other BASIC systems provide this information via the LOC function.  Unfortunately, this facility is impractical on a multi-user system.  Another method is used:

	on error goto recovery
	open "/dev/tty11" as #1 len = 1
	shell "stty 1200 </dev/tty11"
reread:	a$ = inkey$
	while a$ <> ""
		print #1, a$;
		a$ = inkey$
	wend
	while 1
		a$ = input$(1, 1)
		print a$;
	wend
recovery: resume reread
This example uses the optional LEN parameter in the OPEN statement to set a limit on the time the program will wait for input from the terminal.  Here the limit is set to one second of real time.  Notice also that the SHELL statement is used to execute the UNIX utility “stty” to set the baud rate to 1200 for our modem. 

This program simply manages the modem for us.  All input from the user terminal is output to the modem and vice versa.  If there is no input from the modem, the INPUT$ function will time out after one second and produce an error.  The error recovery routine then resumes at the point where the program attempts to read from the user terminal.  This program needs to be interrupted in order to terminate. 

If the time-out period is not set with a LEN parameter, the program will wait forever; the programmer must determine the appropriate method.  The sorts of deadlock problems outlined in the “Pipes” section are not possible with ordinary terminals, apart from waiting for input that never arrives, which is straightforward to handle.  On the other hand, when the line is attached through a modem to a like-minded computer, this program can cause difficulties. 

4. CONCLUSION

A number of powerful features are available in the Basmark brand of QuickBASIC.  These features should make it easier to write better programs more quickly.  Start writing programs!

from The Basmark QuickBASIC Programmer’s Manual by Lawrence Leinweber