The Standard ML Basis Library


The STREAM_IO signature

The STREAM_IO signature defines the interface of the Stream IO layer in the I/O stack. This layer provides buffering over the primitive readers and writers of the primitive IO layer.

Input streams are treated in the lazy functional style: that is, input from a stream f yields a finite vector of elements, plus a new stream f'. Input from f again will yield the same elements; to advance within the stream in the usual way it is necessary to do further input from f'. This interface allows arbitrary lookahead to be done very cleanly, which should be useful both for ad hoc lexical analysis and for table-driven, regular-expression-based lexing.
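
The lookahead property can be seen in a small sketch. It assumes that a text instance of this signature is reachable as TextIO.StreamIO, and that TextIO.openString and TextIO.getInstream are available for building a stream from a string. None of these three names is defined on this page, so treat them as assumptions about the surrounding library; first and rest are hypothetical helpers introduced only for the illustration.

```sml
(* Assumption: TextIO.StreamIO matches STREAM_IO with elem = char. *)
structure S = TextIO.StreamIO

val f : S.instream = TextIO.getInstream (TextIO.openString "abc")

(* first strm: the next element of strm, if any. *)
fun first (strm : S.instream) : char option =
  case S.input1 strm of
      NONE        => NONE
    | SOME (c, _) => SOME c

(* rest strm: the stream after the next element (strm itself at end-of-stream). *)
fun rest (strm : S.instream) : S.instream =
  case S.input1 strm of
      NONE         => strm
    | SOME (_, f') => f'

val a1 = first f          (* reads from f *)
val a2 = first f          (* reads from f again: the same element *)
val b  = first (rest f)   (* reads from the advanced stream *)
```

Here a1 and a2 are equal because both read from f, while b is the following element because it reads from the stream returned by the first read.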

Output streams are handled more conventionally, since the lazy functional style doesn't seem to make sense for output.

Stream I/O functions may raise the Size exception if a resulting vector of elements would exceed the maximum vector size, or the IO.Io exception. In general, when IO.Io is raised as a result of a failure in a lower-level module, the underlying exception is propagated up as the cause component of the IO.Io exception value. This will usually be a Subscript, OS.SysErr or Fail exception, but the stream I/O module will rarely (perhaps never) need to inspect it.


Synopsis

signature STREAM_IO

Interface

type elem
type vector
type reader
type writer
type instream
type outstream
type in_pos
type out_pos
type pos
val input : instream -> (vector * instream)
val input1 : instream -> (elem * instream) option
val inputN : (instream * int) -> (vector * instream)
val inputAll : instream -> vector
val canInput : (instream * int) -> int option
val closeIn : instream -> unit
val endOfStream : instream -> bool
val mkInstream : (reader * vector) -> instream
val getReader : instream -> (reader * vector)
val getPosIn : instream -> in_pos
val setPosIn : in_pos -> instream
val filePosIn : in_pos -> pos
val output : (outstream * vector) -> unit
val output1 : (outstream * elem) -> unit
val flushOut : outstream -> unit
val closeOut : outstream -> unit
val setBufferMode : (outstream * IO.buffer_mode) -> unit
val getBufferMode : outstream -> IO.buffer_mode
val mkOutstream : (writer * IO.buffer_mode) -> outstream
val getWriter : outstream -> (writer * IO.buffer_mode)
val getPosOut : outstream -> out_pos
val setPosOut : out_pos -> outstream
val filePosOut : out_pos -> pos

Description

type elem
type vector
These are the abstract types of stream elements and vectors of elements. For text streams, these are Char.char and String.string, while for binary streams, these are Word8.word and Word8Vector.vector.

type reader
type writer
These are the types of the readers and writers that underlie the input and output streams.

type instream
These are buffered functional input streams.

type outstream
These are buffered output streams. Unlike input streams, these are imperative objects.

type in_pos
type out_pos
These are the abstract types of positions in input and output streams.

type pos
This is the type of positions in the underlying readers and writers.

input f
if elements are available, returns a vector of one or more elements from the stream and the remainder of the stream. If the end-of-stream has been reached, then the empty vector is returned. May block until one of these conditions is satisfied. This function raises the Io exception if there is an error in the underlying system calls.
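
As a hedged illustration of how the empty vector marks end-of-stream, the sketch below collects every chunk returned by input. It assumes a text instance of this signature named TextIO.StreamIO, plus TextIO.openString and TextIO.getInstream to obtain a stream; chunks is a hypothetical helper, not part of the signature.

```sml
(* Assumption: TextIO.StreamIO matches STREAM_IO with vector = string. *)
structure S = TextIO.StreamIO

(* chunks f: all vectors returned by successive calls to input,
   stopping at the first empty vector. *)
fun chunks (f : S.instream) : S.vector list =
  let
    val (v, f') = S.input f
  in
    if String.size v = 0 then [] else v :: chunks f'
  end

val f = TextIO.getInstream (TextIO.openString "hello world")
val whole = String.concat (chunks f)
```

How the data is split into chunks is up to the implementation, so only the concatenation of the chunks is predictable.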

input1 f
returns the next element in the stream f and the remainder of the stream. If the stream is at the end, then NONE is returned. May block until one of these conditions is satisfied. This function raises the Io exception if there is an error in the underlying system calls.
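
A minimal sketch of element-by-element traversal with input1 follows. It assumes a text instance of this signature named TextIO.StreamIO and uses TextIO.openString and TextIO.getInstream to build a stream; count is a hypothetical helper.

```sml
(* Assumption: TextIO.StreamIO matches STREAM_IO with elem = char. *)
structure S = TextIO.StreamIO

(* count (f, n): n plus the number of elements remaining in f.
   Each step continues from the stream returned by input1. *)
fun count (f : S.instream, n : int) : int =
  case S.input1 f of
      NONE         => n
    | SOME (_, f') => count (f', n + 1)

val f = TextIO.getInstream (TextIO.openString "hello")
val n = count (f, 0)
```

Because f itself is never changed, count (f, 0) can be evaluated again and gives the same answer.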

inputN (f, n)
returns a vector of the next n elements from f and the rest of the stream. If fewer than n elements are available, then it returns all of the elements up to the end-of-stream (the empty vector means that there is no more input). May block until it can determine if additional characters are available or the end-of-stream condition holds. This function raises the Io exception if there is an error in the underlying system calls. Raises Size if n < 0 or the number of elements to be returned is greater than maxLen.

Using instreams, one can synthesize a non-blocking version of inputN from inputN and canInput, as inputN is guaranteed not to block if a previous call to canInput returned SOME _.
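
One possible shape for such a non-blocking variant is sketched below. The name inputNNB is hypothetical, and the sketch assumes a text instance of this signature named TextIO.StreamIO. It requests only the k elements that canInput reported, which is a conservative choice rather than a requirement of the interface.

```sml
(* Assumption: TextIO.StreamIO matches STREAM_IO. *)
structure S = TextIO.StreamIO

(* inputNNB (f, n): NONE if input would block; otherwise SOME (v, f'),
   where v holds at most n elements. An empty v means end-of-stream. *)
fun inputNNB (f : S.instream, n : int) : (S.vector * S.instream) option =
  case S.canInput (f, n) of
      NONE   => NONE
    | SOME k => SOME (S.inputN (f, k))
```

On streams whose reader does not support canInput, this function raises the Io exception, as canInput itself does.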

inputAll f
returns the vector of the rest of the elements in the stream f (i.e., up to end-of-stream). Care should be taken when using this function, since it can block indefinitely on interactive streams. This function raises the Io exception if there is an error in the underlying system calls. Raises Size if the number of elements to be returned is greater than maxLen.

canInput (f, n)
returns NONE if any attempt at input would block. Returns SOME k, where 0 <= k <= n, if a call to input would return immediately with k characters. Note that k = 0 corresponds to the stream being at end-of-stream.

Some streams may not support this operation, in which case the Io exception will be raised. This function also raises the Io exception if there is an error in the underlying system calls. It raises the Size exception if n < 0.

Implementation note:

Implementations of canInput should attempt to return as large a k as possible. For example, if the buffer contains 10 characters and the user calls canInput (f, 15), canInput should call readVecNB 5 to see if an additional 5 characters are available.



closeIn f
truncates the instream f, and releases any associated system resources. Applying closeIn on a closed stream has no effect.

endOfStream f
tests if f satisfies the end-of-stream condition. If there is no further input in the stream, then this returns true; otherwise it returns false. This function raises the Io exception if there is an error in the underlying system calls.

This function may block when checking for more input. It is equivalent to

            (length(#1(input f)) = 0)
          
where length is the vector length operation.

Note that even if this returns true, subsequent input operations may succeed if more data becomes available. We always have

            endOfStream f = endOfStream f
          
In addition, if endOfStream f returns true, then input f returns ("",f') and endOfStream f' may or may not be true.

mkInstream (rd, v)
returns a new instream built on top of the reader rd with initial buffer contents v.
Question:

We should explain the mapping between optional fields of the reader and supported operations (as a table?).

Note that building more than one instream on top of a single reader has unpredictable effects, since readers are imperative objects.

getReader f
truncates the instream f and returns the underlying reader along with any unconsumed data from its buffer. This raises the exception Io if f is closed or truncated.

getPosIn strm
returns the current position in the stream strm.

setPosIn pos
returns a stream based on the position and stream recorded in pos.

filePosIn pos
returns the primitive-level reader position that corresponds to the abstract input stream position pos.

output (f, vec)
writes the vector of elements vec to the stream f. This raises the exception Io if f is terminated. This function also raises the Io exception if there is an error in the underlying system calls.

output1 (f, elem)
writes the element elem to the stream f. This raises the exception Io if f is terminated. This function also raises the Io exception if there is an error in the underlying system calls.

flushOut f
flushes any output in the outstream's buffer to the underlying writer; it is a no-op on terminated streams. This function raises the Io exception if there is an error in the underlying system calls.

closeOut f
flushes f's buffers, marks the stream closed, and closes the underlying writer. This operation has no effect if f is already closed. If f is terminated, it should close the underlying writer. This function raises the Io exception if there is an error in the underlying system calls.

setBufferMode (ostr, mode)
getBufferMode ostr
set and get the buffering mode of the output stream ostr. Setting the buffer mode to IO.NO_BUF causes any buffered output to be flushed.
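
The sketch below switches an output stream to unbuffered mode and then restores the previous mode. It assumes a text instance of this signature named TextIO.StreamIO, that TextIO.getOutstream and TextIO.stdOut are available to obtain an outstream, and that values of IO.buffer_mode can be compared with =; none of this is stated on this page.

```sml
(* Assumption: TextIO.StreamIO matches STREAM_IO with vector = string. *)
structure S = TextIO.StreamIO

val out : S.outstream = TextIO.getOutstream TextIO.stdOut
val saved = S.getBufferMode out             (* remember the current mode *)

val () = S.setBufferMode (out, IO.NO_BUF)   (* also flushes buffered output *)
val unbuffered = (S.getBufferMode out = IO.NO_BUF)

val () = S.output (out, "unbuffered write\n")
val () = S.setBufferMode (out, saved)       (* restore the original mode *)
```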

mkOutstream (wr, mode)
returns a new outstream built on top of the writer wr, with initial buffering mode mode.
Question:

We should explain the mapping between optional fields of the writer and supported operations (as a table?).

Note that building more than one outstream on top of a single writer has unpredictable effects, since buffering may change the order of output.

getWriter f
flushes and terminates the outstream f, and returns the underlying writer. This raises the exception Io if f is closed.

getPosOut strm
returns the current position in the stream strm.

setPosOut pos
sets the current position of the stream underlying pos to the position recorded in pos, and returns the stream.

filePosOut pos
returns the primitive-level writer position that corresponds to the abstract output stream position pos.


Discussion

The following expressions are all guaranteed true, if they complete without exception.

Input is semi-deterministic: input may read any number of elements from f the "first" time, but then it is committed to its choice, and must return the same number of elements on subsequent reads from the same point.

let val (a,_) = input f
    val (b,_) = input f
 in  a=b
end

Closing a stream just causes the not-yet-determined part of the stream to be empty:

let val (a,f') = input f
    val _ = closeIn f
    val (b,_) = input f
 in  a=b andalso endOfStream f'
end

Closing a terminated stream is legal and harmless:

  (closeIn f; closeIn f; true)

If a stream has already been at least partly determined, then input cannot possibly block:

let val (a,_) = input f
 in isSome (canInput (f, length a))
end (* must be true *)
Note that a successful canInput does not imply that more characters remain before end-of-stream, just that reading won't block.

A freshly opened stream is still undetermined (no "read" has yet been done on the underlying reader):

let val a = mkInstream (r, empty)   (* empty denotes the empty vector *)
 in closeIn a;
    length(#1(input a)) = 0
end
This has the useful consequence that if one opens a stream, then extracts the underlying reader, the reader has not yet been advanced in its file.

Closing a stream guarantees that the underlying reader will never again be accessed; so input can't possibly block.

The endOfStream test is equivalent to input returning an empty sequence:

let val (a,_) = input f  
  in (length(a)=0) = (endOfStream f)   
end

Unbuffered I/O: if chunkSize = 1 in the underlying reader, then input operations must be unbuffered:

let
val f = mkInstream(reader, empty)   (* empty denotes the empty vector *)
val (a,f') = input f
val (PrimIO.Rd{chunkSize,...}, _) = getReader f
in
  (chunkSize > 1) orelse endOfStream f'
end
Although input may perform a read(k) operation on the reader (for k >= 1), it must immediately return all the elements it receives. However, this does not hold for partly determined instreams:
 let val f = mkInstream(reader, empty)   (* empty denotes the empty vector *)
     val _ = doInputOperationsOn(f)
     val (a,f') = input f
     val (PrimIO.Rd{chunkSize,...}, _) = getReader f
  in chunkSize>1 orelse endOfStream f'  (* could be false *)
 end
because in this case, the stream f may have accumulated a history of several responses, and input is required to repeat them one at a time.

Output buffering is controlled by the getBufferMode and setBufferMode functions.

Don't bother the reader: input must be done without any operation on the underlying reader, whenever it is possible to do so by using elements from the buffer. This is necessary so that repeated calls to endOfStream will not make repeated system calls.

See Also

PRIM_IO, IMPERATIVE_IO, TEXT_STREAM_IO, StreamIO


Last Modified May 10, 1996
Comments to John Reppy.
Copyright © 1997 Bell Labs, Lucent Technologies