The Standard ML Basis Library


Introduction

This document describes the Standard ML Basis Library. This library provides an extensive initial basis for Standard ML, which complements the language described by the Definition of Standard ML. The goals of the Basis Library are to:

In this chapter, we discuss the principles and conventions used in the design of the Library, and present a high-level view of the library structure.

By design, the Basis Library is meant to provide a fairly rich collection of general-purpose modules that can serve as the basis for applications programming or for more domain-specific libraries. One criterion for inclusion in the Basis Library is that a type or value requires compiler or run-time system support. In addition, the Library defines a standard minimal environment that anyone using SML can expect to find. The Library also attempts to provide similar functions in similar contexts. Thus, the traditional app function for lists, which applies a function to each member of a list, has also been provided for arrays and vectors.
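
As a small illustration of similar functions in similar contexts, the following sketch sums the elements of a list, an array and a vector through each structure's app function. The helper name sumWith is ours and is not part of the Library.

```sml
(* sumWith is a hypothetical helper: it takes an app-style function and a
   container of ints, and accumulates the elements in a ref cell. *)
fun sumWith app xs =
  let
    val total = ref 0
  in
    app (fn x => total := !total + x) xs;
    !total
  end

(* The same helper works with the app function of each structure. *)
val listSum   = sumWith List.app   [1, 2, 3]
val arraySum  = sumWith Array.app  (Array.fromList [1, 2, 3])
val vectorSum = sumWith Vector.app (Vector.fromList [1, 2, 3])
```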

An opposite design force has been the desire to keep the Basis Library small. In general, a function has been included only if it has clear or proven utility, with additional emphasis on those that are complicated to implement, require compiler support, or are more concise or efficient than an equivalent combination of other functions. Some exceptions were made for historical reasons.

Library modules

The Basis Library is contained in a set of structures. Almost every type, exception constructor and value belongs to some structure. Although some identifiers are also bound in the initial top-level environment, we have attempted to keep the number of top-level identifiers small. Infix declarations and overloading are specified for the top-level environment.

We view the signature and structure names used below as being reserved. For an implementation to be conforming, any module it provides that is named in the SML Basis Library must exactly match the description specified in the library. For example, the Int structure provided by an implementation should not match a superset of the INTEGER signature. If an implementation provides any types, values or modules not described in the SML Basis Library, they must be encapsulated in additional structures whose names are not used by the SML Basis Library. In particular, an implementation must not introduce any new non-module identifiers into the top-level environment.
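
To make the encapsulation rule concrete, here is a minimal sketch of how an extra integer operation might be packaged. The structure name IntExtras and the function gcd are hypothetical; the point is only that the addition lives in a separately named structure rather than in Int or at top level.

```sml
(* IntExtras is a hypothetical name that the Basis Library does not use. *)
structure IntExtras =
struct
  (* Greatest common divisor; not part of the INTEGER signature. *)
  fun gcd (a, 0) = Int.abs a
    | gcd (a, b) = gcd (b, a mod b)
end

val g = IntExtras.gcd (12, 18)
```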

Required modules

We have divided the modules into required and optional categories. Any conforming implementation of the SML Basis Library must provide implementations of all of the required modules.

Many of the structures are variations on some generic module (e.g., single and double-precision floating-point numbers). The following table gives a list of the required generic signatures.


Signature       Description
CHAR            Generic character interface
INTEGER         Generic integer interface
MATH            Generic math library interface
IMPERATIVE_IO   Imperative I/O interface
MONO_ARRAY      Mutable monomorphic arrays
MONO_VECTOR     Immutable monomorphic vectors
PRIM_IO         System-call operations for IO
REAL            Generic real number interface
STREAM_IO       Stream I/O interface
STRING          Generic string interface
SUBSTRING       Generic substring interface
TEXT_IO         Text I/O interface
TEXT_STREAM_IO  Text stream I/O interface
WORD            Generic word (i.e., unsigned modular integer) interface
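
Because several structures match the same generic signature, code can be written once against the signature and reused. The following is a sketch only: the functor name SumFn and its result structures are ours, not part of the Library.

```sml
(* SumFn is a hypothetical functor that works for any structure
   matching the generic INTEGER signature. *)
functor SumFn (I : INTEGER) =
struct
  (* Add up a list of I.int values, starting from zero. *)
  fun sum (xs : I.int list) : I.int =
    List.foldl (fn (x, acc) => I.+ (x, acc)) (I.fromInt 0) xs
end

(* Instantiate it for two of the required integer structures. *)
structure IntSum = SumFn (Int)
structure LargeIntSum = SumFn (LargeInt)

val a = IntSum.sum [1, 2, 3]
val b = LargeIntSum.sum (List.map LargeInt.fromInt [10, 20])
```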

Non-generic signatures typically define the interface of a unique structure. A list of the required non-generic signatures is given below.
Signature     Description
ARRAY         Mutable polymorphic arrays
BIN_IO        Binary input/output types and operations
BOOL          Boolean type and values
BYTE          Conversions between Word8 and Char values
COMMAND_LINE  Program name and arguments
DATE          Calendar operations
GENERAL       General-purpose types, exceptions and values
IEEE_REAL     Floating-point classes and hardware control
IO            Basic I/O types and exceptions
LIST          List type and utility functions
LIST_PAIR     List of pairs and utility functions
OPTION        Optional values and partial functions
OS            Basic operating system services
OS_FILE_SYS   File status and directory operations
OS_IO         Support for polling I/O devices
OS_PATH       Pathname operations
OS_PROCESS    Simple process operations
SML90         Structure for backward compatibility
STRING_CVT    Support for conversions between strings and values
TIME          Representation of time values
TIMER         Timing operations
VECTOR        Immutable polymorphic arrays

The required structures (and their signatures) are listed next.
Structure    Signature     Description
Array        ARRAY         Mutable polymorphic arrays
BinIO        BIN_IO        Binary input/output types and operations
BinPrimIO    PRIM_IO       Low-level binary IO
Bool         BOOL          Boolean type and values
Byte         BYTE          Conversions between Word8 and Char values
Char         CHAR          Default characters
CharArray    MONO_ARRAY    Mutable arrays of characters
CharVector   MONO_VECTOR   Immutable arrays of characters
CommandLine  COMMAND_LINE  Program name and arguments
Date         DATE          Calendar operations
General      GENERAL       General-purpose types, exceptions and values
IEEEReal     IEEE_REAL     Floating-point classes and hardware control
Int          INTEGER       Default integer type
IO           IO            Basic I/O types and exceptions
LargeInt     INTEGER       Largest integer representation
LargeReal    REAL          Largest floating-point representation
LargeWord    WORD          Largest word representation
List         LIST          List type and utility functions
ListPair     LIST_PAIR     List of pairs and utility functions
Math         MATH          Default math structure
Option       OPTION        Optional values and partial functions
OS           OS            Basic operating system services
OS.FileSys   OS_FILE_SYS   File status and directory operations
OS.IO        OS_IO         Support for polling I/O devices
OS.Path      OS_PATH       Pathname operations
OS.Process   OS_PROCESS    Simple process operations
Position     INTEGER       File system positions
Real         REAL          Default floating-point type
SML90        SML90         Structure for backward compatibility
String       STRING        Default strings
StringCvt    STRING_CVT    Conversions between strings and various types
Substring    SUBSTRING     Substrings
TextIO       TEXT_IO       Text input/output types and operations
TextPrimIO   PRIM_IO       Low-level text IO
Time         TIME          Representation of time values
Timer        TIMER         Timing operations
Vector       VECTOR        Immutable polymorphic vectors
Word         WORD          Default word type
Word8        WORD          8-bit words
Word8Array   MONO_ARRAY    Arrays of 8-bit words
Word8Vector  MONO_VECTOR   Vectors of 8-bit words

Optional modules

The library specifies a large collection of signatures and structures that are considered optional in a conforming implementation. They provide features that, although useful, are not considered fundamental to a workable SML implementation. These modules include additional representations of integers, words, characters and reals; more efficient array and vector representations; and a subsystem providing POSIX compatibility.

Although an implementation may or may not provide one of these modules, if it provides one, the module must exactly match the specification given in this document. The names specified here for optional signatures and structures must be used at top-level only to denote implementations of the specified library module. On the other hand, if an implementation offers features related to an optional module, it should also provide the optional module.

The library specifies the following optional signatures.


Signature       Description
ARRAY2          Mutable polymorphic 2-dimensional arrays
INT_INF         Arbitrary-precision integers
LOCALE          Support for locale-dependent applications
MONO_ARRAY2     Mutable monomorphic 2-dimensional arrays
MULTIBYTE       Support for multibyte characters
PACK_REAL       Support for packing floats into vectors of 8-bit words
PACK_WORD       Support for packing words into vectors of 8-bit words
POSIX           Root POSIX structure
POSIX_ERROR     POSIX error values
POSIX_FILE_SYS  POSIX file system operations
POSIX_FLAGS     Support for sets of system flags
POSIX_IO        POSIX I/O operations
POSIX_PROC_ENV  POSIX process environment operations
POSIX_PROCESS   POSIX process operations
POSIX_SIGNAL    POSIX signal types and values
POSIX_SYS_DB    POSIX system database types and values
POSIX_TTY       Control of POSIX TTY drivers
UNIX            Various Unix-specific operations

The following table gives the set of optional structures.
Structure        Signature       Description
Array2           ARRAY2          Mutable polymorphic 2-dimensional arrays
BoolArray        MONO_ARRAY      Mutable arrays of booleans
BoolArray2       MONO_ARRAY2     2-dimensional arrays of booleans
BoolVector       MONO_VECTOR     Immutable arrays of booleans
CharArray2       MONO_ARRAY2     2-dimensional arrays of characters
FixedInt         INTEGER         Largest fixed precision integers
ImperativeIO     IMPERATIVE_IO   Functor to convert stream I/O into imperative IO
IntInf           INT_INF         Arbitrary-precision integers
IntN             INTEGER         N-bit, fixed precision integers
IntArray         MONO_ARRAY      Mutable arrays of default integers
IntNArray        MONO_ARRAY      Mutable arrays of N-bit integers
IntArray2        MONO_ARRAY2     2-dimensional arrays of integers
IntNArray2       MONO_ARRAY2     2-dimensional arrays of N-bit integers
IntVector        MONO_VECTOR     Immutable vectors of default integers
IntNVector       MONO_VECTOR     Immutable vectors of N-bit integers
Locale           LOCALE          Support for locale-dependent applications
MultiByte        MULTIBYTE       Support for multibyte characters
PackRealNBig     PACK_REAL       Big-endian packing for N-bit floats
PackRealNLittle  PACK_REAL       Little-endian packing for N-bit floats
PackRealBig      PACK_REAL       Big-endian packing for default floats
PackRealLittle   PACK_REAL       Little-endian packing for default floats
PackNBig         PACK_WORD       Big-endian packing for N-byte words
PackNLittle      PACK_WORD       Little-endian packing for N-byte words
Posix            POSIX           Root POSIX structure
Posix.Error      POSIX_ERROR     POSIX error values
Posix.FileSys    POSIX_FILE_SYS  POSIX file system operations
Posix.IO         POSIX_IO        POSIX I/O operations
Posix.ProcEnv    POSIX_PROC_ENV  POSIX process environment operations
Posix.Process    POSIX_PROCESS   POSIX process operations
Posix.Signal     POSIX_SIGNAL    POSIX signal types and values
Posix.SysDB      POSIX_SYS_DB    POSIX system database types and values
Posix.TTY        POSIX_TTY       Control of POSIX TTY drivers
PrimIO           PRIM_IO         Functor to build PRIM_IO structure
RealArray        MONO_ARRAY      Mutable arrays for default floats
RealVector       MONO_VECTOR     Immutable vectors for default floats
RealN            REAL            N-bit floating-point numbers
RealNArray       MONO_ARRAY      Mutable arrays of N-bit floating-point numbers
RealNVector      MONO_VECTOR     Immutable vectors of N-bit floating-point numbers
RealArray2       MONO_ARRAY2     2-dimensional arrays of floating-point numbers
RealNArray2      MONO_ARRAY2     2-dimensional arrays of N-bit floating-point numbers
StreamIO         STREAM_IO       Functor to convert primitive I/O into stream I/O
SysWord          WORD            Words sufficient for OS operations
WideChar         CHAR            Support for wide characters
WideCharArray    MONO_ARRAY      Mutable arrays of wide characters
WideCharArray2   MONO_ARRAY2     2-dimensional arrays of wide characters
WideCharVector   MONO_VECTOR     Immutable vectors of wide characters
WideString       STRING          Support for wide strings
WideSubstring    SUBSTRING       Support for wide substrings
WideTextPrimIO   PRIM_IO         Low-level wide char IO
WideTextIO       TEXT_IO         Text I/O on wide characters
WordN            WORD            N-bit words
Word8Array2      MONO_ARRAY2     2-dimensional arrays of 8-bit words
Unix             UNIX            Unix-like process invocation

Module dependencies

We specify certain relationships among the modules.

Backward compatibility

To permit users to compile programs written under the old basis, we require that each implementation provide the structure SML90. This structure contains the top-level bindings specified in the 1990 version of the Definition, along with one or more substructures that define the top-level bindings of various implementations. For example, a user might write:

local
  open SML90 SML90.SMLNJ
in
  (* user's program *)
end
to compile a user's program under the old SML/NJ basis.

We expect that at some future point, the SML90 module will be deemed obsolete, and will be dropped from the standard basis.

Design rules and conventions

In designing the library, we have tried to follow a set of stylistic rules to make library usage consistent and predictable, and to preclude certain errors. These rules are not meant to be prescriptive for the programmer using or extending the library. On the other hand, although the library itself departs from the conventions on occasion, we feel the rules are reasonable and helpful, and would encourage their use.

Orthographic conventions

We use a new set of spelling and capitalization conventions. Some of these conventions, e.g., the capitalization of value constructors, seem to be widely accepted in the user community. Other decisions were based less on dominant style or compelling reason than on compromise and the need for consistency and some sense of good taste.

The conventions we use are:

The above conventions concerning variable and constructor names, if followed consistently, can be used by a compiler to aid in detecting the subtle error in which a constructor is misspelled in a pattern-match and is thus treated as a variable binding. Some implementations may provide the option of enforcing these conventions by generating warning messages.
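
The error in question can be sketched as follows. The datatype color and the function code are hypothetical names chosen for illustration. Because Gren is not a constructor, it is read as a variable pattern that matches any value, so the function silently accepts Blue as well.

```sml
datatype color = Red | Green | Blue

(* Intended to handle Red and Green only, but "Gren" is a misspelling.
   It is treated as a variable binding, so the second clause matches
   every remaining value, including Blue. *)
fun code Red = 0
  | code Gren = 1

val greenCode = code Green
val blueCode  = code Blue   (* also handled by the "Gren" clause *)
```

A compiler that knows constructors are capitalized and variables are not could flag Gren as suspicious.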

Naming

Similar values should have similar names, with similar type shapes, following the conventions outlined above. For example, the function Array.app has the type:

    val app : ('a -> unit) -> 'a array -> unit
which has the same shape as List.app. Names should be meaningful, but concise. We have broken this rule, however, in certain instances where previous usage seemed compelling. For example, we have kept the name app rather than adopt apply. More dramatically, we have purposely kept most of the traditional Unix names in the optional Posix modules, to capitalize on the familiarity of these names and the available documentation.

Comparisons

Many structures define a type ty along with a comparison function

    val compare : ty * ty -> order
plus the expected relational operators >, >=, < and <=. In all cases, the standard relationships hold between these functions. For example, we have x > y = true if and only if compare(x, y) = GREATER. If, in addition, ty is an equality type, we assume that the operators = and <> satisfy the usual relationships with compare and the relational operators. For example, if x = y, then compare(x,y) = EQUAL. Note that these assumptions are not quite true for real values; see the REAL signature for more details.
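
The relationships can be spot-checked for integers with a short sketch. The function name agrees is ours. The last lines illustrate the caveat for reals under the assumption that 0.0 / 0.0 evaluates to an IEEE NaN, which is unordered with respect to every value.

```sml
(* Check, for one pair of integers, that the relational operators and
   equality agree with Int.compare. *)
fun agrees (x : int, y : int) : bool =
  (x > y) = (Int.compare (x, y) = GREATER) andalso
  (x < y) = (Int.compare (x, y) = LESS) andalso
  (x = y) = (Int.compare (x, y) = EQUAL)

val ok = List.all agrees [(1, 2), (2, 1), (3, 3)]

(* Reals are the exception: with a NaN operand, both comparisons are false. *)
val nan = 0.0 / 0.0
val nanGreater = nan > 1.0
val nanLessEq  = nan <= 1.0
```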

Types that have a standard or obvious linear order come with the full set of relational operators plus a compare function. Certain abstract types, e.g., OS.FileSys.file_id, provide only a compare function, for use with, for example, ordered binary trees.
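
As a sketch of how a compare function alone is enough to build an ordered binary tree, consider the following. The datatype tree and the functions insert and toList are hypothetical and not part of the Library; String.compare stands in for the compare function of any type.

```sml
datatype 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* Insert x into a binary search tree ordered by cmp; duplicates are dropped. *)
fun insert cmp (x, Leaf) = Node (Leaf, x, Leaf)
  | insert cmp (x, t as Node (l, y, r)) =
      (case cmp (x, y) of
         LESS    => Node (insert cmp (x, l), y, r)
       | GREATER => Node (l, y, insert cmp (x, r))
       | EQUAL   => t)

(* An in-order traversal yields the elements in ascending order. *)
fun toList Leaf = []
  | toList (Node (l, x, r)) = toList l @ (x :: toList r)

val t = List.foldl (insert String.compare) Leaf ["pear", "apple", "fig", "apple"]
val sorted = toList t
```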

Conversions

Most structures defining a type provide conversion functions to and from other types. When unambiguous, we use the naming convention toT and fromT, where T is some version of the name of the other type. For example, in WORD, we have

    val fromInt : Int.int -> word
    val toInt : word -> Int.int
If this naming is ambiguous (e.g., a structure defines multiple types that have conversions from integers), we use the convention TFromTT and TToTT. For example, in POSIX_PROC_ENV, we have
    val uidToWord : uid -> SysWord.word
    val gidToWord : gid -> SysWord.word
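
A short usage sketch of the toT and fromT convention, using the Word functions shown above together with Int.toLarge and Int.fromLarge from the INTEGER signature:

```sml
(* Convert an int to a word, do word arithmetic, and convert back. *)
val w : word = Word.fromInt 10
val n : int  = Word.toInt (w + 0w5)

(* The same naming pattern links Int and LargeInt. *)
val big  : LargeInt.int = Int.toLarge 42
val back : int          = Int.fromLarge big
```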

There should be conversions to and from strings for most types. Following the convention above, these functions are typically called toString and fromString. Usually, modules provide additional string conversion functions that allow more control over format and operate on an abstract character stream. These functions are called fmt and scan. The input accepted by fromString and scan consists of printable ASCII characters. The output generated by toString and fmt consists of printable ASCII characters.
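
The following sketch shows fmt and scan for integers. The reader listGetc is a hypothetical character stream over a char list, written here to show that scan works over any stream of that shape and returns the unconsumed remainder.

```sml
(* fmt gives control over the radix; toString uses the default format. *)
val hex = Int.fmt StringCvt.HEX 255
val dec = Int.toString 255

(* scanString adapts a scan function to read from a string. *)
val fromHex = StringCvt.scanString (Int.scan StringCvt.HEX) "ff"

(* listGetc is a hypothetical reader treating a char list as a stream. *)
fun listGetc [] = NONE
  | listGetc (c :: cs) = SOME (c, cs)

(* scan returns the value together with the rest of the stream. *)
val scanned = Int.scan StringCvt.DEC listGetc (String.explode "12 rest")
```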

We adopt the convention that conversions from strings should be forgiving, allowing initial white space and multiple formats, and ignoring additional terminating characters. On the other hand, we have tried to specify conversions to strings precisely. In addition, for basic types, scanning functions should accept legal SML literals, and formatting functions should, whenever possible, produce the value part of a valid SML literal but, for flexibility, may omit certain annotations. For example, String.toString produces a valid SML string constant, but without the enclosing quotes, and Word.toString produces a word constant without the "0wx" prefix.
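
A few evaluations illustrate both halves of this convention. The inputs are arbitrary examples.

```sml
(* Conversions from strings are forgiving. *)
val a = Int.fromString "  42abc"   (* leading space and trailing text ignored *)
val b = Int.fromString "-7"        (* a minus sign is accepted as well as ~ *)
val c = Int.fromString "abc"       (* no integer prefix at all *)

(* Conversions to strings are precise and omit annotations. *)
val d = Word.toString 0wxFF        (* no "0wx" prefix *)
val e = String.toString "a\nb"     (* newline rendered as an escape, no quotes *)
```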

Characters and strings

The old basis did not provide a character type, only a string type. To manipulate characters, programmers used integers corresponding to the character's code. This was unsatisfactory for several reasons:

Alternatively, programmers used strings of length one to represent characters, which is less efficient and cannot be enforced by the type system.

The revised SML Definition introduces a new char type and literal syntax along with the old string type. The SML Standard Basis provides support for both string and char types, where the string type is a vector of characters. In addition, we define the optional types WideString.string and WideChar.char, in which the former is again a vector of the latter, for handling character sets more extensive than Latin-1.
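
A brief sketch of the char and string types working together. The last binding relies on string being the vector type of CharVector, so a CharVector operation applies directly to a string.

```sml
val s = "basis"

(* Characters are values of type char, with their own literal syntax. *)
val c : char = String.sub (s, 0)
val code = Char.ord #"A"

(* Strings can be taken apart into characters and mapped over. *)
val cs : char list = String.explode s
val upper = String.map Char.toUpper s

(* A string is a vector of characters. *)
val len = CharVector.length s
```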

Miscellany

Functional arguments that are evaluated solely for their side-effects should have a return type of unit. For example, the list application function should have the type:

    val app : ('a -> unit) -> 'a list -> unit
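
One consequence for callers is that a function returning a non-unit result must be adapted before it is passed to app. A minimal sketch, using the top-level ignore function and hypothetical names seen and record:

```sml
val seen = ref ([] : int list)

(* record returns the updated list, so its result type is not unit. *)
fun record x = (seen := x :: !seen; !seen)

(* Wrapping the call in ignore yields a function of type int -> unit. *)
val () = List.app (fn x => ignore (record x)) [1, 2, 3]

val result = !seen
```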


Last Modified August 5, 1997
Comments to John Reppy.
Copyright © 1997 Bell Labs, Lucent Technologies