Author: | Albert Gräf <Dr.Graef@t-online.de> |
---|---|

Date: | 2010-09-30 |

Copyright (c) 2009-2010 by Albert Gräf. This document is available under the GNU Free Documentation License.

This manual describes the operations in the standard Pure library, including the prelude and the other library modules which come bundled with the interpreter.

There is a companion to this manual, the Pure Manual which describes the Pure language and the operation of the Pure interpreter.

Contents

- 1 Prelude
- 2 Mathematical Functions
- 3 Container Types
- 4 System Interface
- 5 Index

The prelude defines the basic operations of the Pure language. This
includes the basic arithmetic and logical operations, string, list and
matrix functions, as well as the support operations required to implement
list and matrix comprehensions. The string, matrix and record operations
are in separate modules strings.pure, matrices.pure and records.pure, the
primitive arithmetic and logical operations can be found in
primitives.pure. Note that since the prelude module gets imported
automatically (unless the interpreter is invoked with the `--no-prelude`
option), all operations discussed in this section are normally available in
Pure programs without requiring any explicit import declarations.

The prelude also declares a signature of commonly used constant and
operator symbols. This includes the truth values `true` and `false`.
These are actually just integers in Pure, but sometimes it's convenient to
refer to them using these symbolic constants. Note that if you also want to
use these on the left-hand side of equations, you still have to declare
them as `nonfix` symbols yourself, using a declaration like:

nonfix false true;

In addition, the following special exception symbols are provided:

- Built-in exception values:
`failed_cond`(failed conditional in guard or if-then-else),`failed_match`(failed pattern match in lambda,`case`expression, etc.),`stack_fault`(not enough stack space,`PURE_STACK`limit exceeded) and`malloc_error`(memory allocation error).

- Value mismatches a.k.a. dynamic typing errors:
`bad_string_value x`,`bad_matrix_value x`,`bad_list_value x`,`bad_tuple_value x`. These are thrown by some operations when they fail to find an expected value of the corresponding type.

- Index range exceptions:
`out_of_bounds`is thrown by the index operator`!`if a list, tuple or matrix index is out of bounds.

Here's the list of predefined operator symbols. Note that the parser will automagically give unary minus the same precedence level as the corresponding binary operator.

infixl 1000 $$ ; // sequence operator infixr 1100 $ ; // right-associative application infixr 1200 , ; // pair (tuple) infix 1300 => ; // key=>value pairs ("hash rocket") infix 1400 .. ; // arithmetic sequences infixr 1500 || ; // logical or (short-circuit) infixr 1600 && ; // logical and (short-circuit) prefix 1700 ~ ; // logical negation infix 1800 < > <= >= == ~= ; // relations infix 1800 === ~== ; // syntactic equality infixr 1900 : ; // list cons infix 2000 +: <: ; // complex numbers (cf. math.pure) infixl 2100 << >> ; // bit shifts infixl 2200 + - or ; // addition, bitwise or infixl 2300 * / div mod and ; // multiplication, bitwise and infixl 2300 % ; // exact division (cf. math.pure) prefix 2400 not ; // bitwise not infixr 2500 ^ ; // exponentiation prefix 2600 # ; // size operator infixl 2700 ! !! ; // indexing, slicing infixr 2800 . ; // function composition prefix 2900 ' ; // quote postfix 3000 & ; // thunk

The most important function combinators are `$` (right-associative
application) and `.` (function composition), which are also defined as
macros so that saturated calls of these are eliminated automatically.
Examples:

> foo $ bar 99; foo (bar 99) > (foo.bar) 99; foo (bar 99)

The customary identity and constant combinators from the combinatorial
calculus are also available, in Pure these are named `id` and `cst`,
respectively:

> map id (1..5); [1,2,3,4,5] > map (cst 0) (1..5); [0,0,0,0,0]

There's also a combinator `void` which is basically equivalent to `cst
()`, but with the special twist that it is also defined as a macro
optimizing the case of "throwaway" list and matrix comprehensions. This is
useful if a comprehension is evaluated solely for its side effects. E.g.:

> using system; > extern int rand(); > foo = void [printf "%d\n" rand | _ = 1..3]; > show foo foo = do (\_ -> printf "%d\n" rand) (1..3); > foo; 1714636915 1957747793 424238335 ()

Note that the above list comprehension is actually implemented using `do`
(instead of `map`, which would normally be the case), so that the
intermediate list value of the comprehension is never constructed. This is
described in more detail in section Optimization Rules of the Pure
Manual.

In addition, Pure also provides the following combinators adopted from Haskell:

`flip f`swaps arguments of a binary function`f`, e.g.:> map (flip (/) 2) (1..3); [0.5,1.0,1.5]

This combinator is also used by the compiler to implement right operator sections, which allows you to write the above simply as:

> map (/2) (1..3); [0.5,1.0,1.5]

`curry f`turns a function`f`expecting a pair of values into a curried function of two arguments:> using system; > dowith (curry (printf "%d: %g\n")) (0..2) [0.0,2.718,3.14]; 0: 0 1: 2.718 2: 3.14 ()

Conversely,

`uncurry f`turns a curried function`f`expecting two arguments into a function processing a single pair argument:> map (uncurry (*)) [(2,3),(4,5),(6,7)]; [6,20,42]

`curry3`and`uncurry3`are defined analogously, but are used to work with ternary functions.

The (normal order) fixed point combinator

`fix`allows you to create recursive anonymous functions. It takes another function`f`as its argument and applies`f`to`fix f`itself:> let fact = fix (\f n -> if n<=0 then 1 else n*f (n-1)); > map fact (1..5); [1,2,6,24,120]

See Fixed point combinator at Wikipedia for an explanation of how this magic works. Just like in Haskell,

`fix`can be used to produce least fixed points of arbitrary functions. For instance:> fix (cst bar); bar > let xs = fix (1:); > xs; 1:#<thunk 0x7fe537fe2f90> > xs!!(0..10); [1,1,1,1,1,1,1,1,1,1,1]

The prelude defines the list and tuple constructors (`x:y`, `x,y`), as
well as equality (`==`) and inequality (`~=`) on these structures. (The
latter compare two lists or tuples by recursively comparing their members,
so `==` must be defined on the list or tuple members if you want to use
these operations.) It also provides the predicate `null x` which tests
whether `x` is the empty list or tuple, the function `reverse x` which
reverses a list or tuple, and the operators `#x` (size of a list or
tuple), `x!i` (indexing), `x!!is` (slicing) and `x+y` (list
concatenation). Note that there isn't a separate operation for
concatenating tuples, since the pairing operator already does this:

> (1,2,3),(10,9,8); 1,2,3,10,9,8

This works because the `(,)` constructor is associative in Pure and will
always produce right-recursive pairs. This also implies that tuples are
always flat in Pure and can't be nested; if you need this, you should use
lists instead. Also note that the empty tuple `()` acts as a neutral
element with respect to `(,)`:

> (),(a,b,c); a,b,c > (a,b,c),(); a,b,c

The prelude also provides another special kind of pairs called "hash
pairs", which take the form `key=>value`. These are used in various
contexts to denote key-value associations. The only operations on hash
pairs provided by the prelude are equality testing (which recursively
compares the components) and the functions `key` and `val` which
extract the components of a hash pair:

> ("foo"=>99) == ("bar"=>99); 0 > key ("foo"=>99), val ("foo"=>99); "foo",99

Note that in difference to the tuple operator `(,)`, the "hash rocket"
`(=>)` is non-associative, so nested applications *must* be
parenthesized, and `(x=>y)=>z` is generally *not* the same as
`x=>(y=>z)`. Also note that `(,)` has lower precedence than `(=>)`,
so to include a tuple as key or value in a hash pair, the tuple must be
parenthesized, as in `"foo"=>(1,2)` (whereas `"foo"=>1,2` denotes a
tuple whose first element happens to be a hash pair).

Lists are the usual right-recursive aggregates, pretty much the same as in
Lisp or Prolog except that they use a Haskell-like syntax. In difference to
Haskell, list concatenation is denoted `+`, and lists may contain an
arbitrary mixture of arguments, i.e., they are fully polymorphic:

> 1:2:3:[]; [1,2,3] > [1,2,3]+[u,v,w]+[3.14]; [1,2,3,u,v,w,3.14]

Lists are **eager** in Pure by default, but they can also be made **lazy**, see
section Lazy Evaluation and Streams in the Pure Manual.

Arithmetic sequences can be constructed with the infix `..` operator:

> 1..5; [1,2,3,4,5] > 1:3..11; [1,3,5,7,9,11]

Note that the Pure syntax differs from Haskell in that there are no
brackets around the construct and a step width is indicated by specifying
the first two elements as `x:y` instead of `x,y`. Also, to specify
infinite sequences you have to use an infinite upper bound (`inf` or
`-inf`):

> 1:3..inf; 1:#<thunk 0x7f696cd2dbd8> > -1:-3..-inf; -1:#<thunk 0x7f696cd2fde8>

The lower bounds of an arithmetic sequence must always be finite.

Non-lazy lists can be sorted using the `sort` function, which provides an
interface to the C `qsort()` routine, please see your local qsort(3)
documentation for details. An order predicate `p` is specified as the
first argument, which must return a nonzero integer to indicate that its
first argument is "less than" its second argument. Example:

> sort (>) (1..10); [10,9,8,7,6,5,4,3,2,1] > sort (<) ans; [1,2,3,4,5,6,7,8,9,10]

You can convert between (finite) lists and tuples using the `list` and
`tuple` operations:

> tuple (1..5); 1,2,3,4,5 > list (a,b,c); [a,b,c]

The `list` function can also be used to turn a finite lazy list into an
eager one:

> list $ take 10 (-1:-3..-inf); [-1,-3,-5,-7,-9,-11,-13,-15,-17,-19]

You can also achieve the same effect somewhat more conveniently by slicing a finite part from a stream (see below):

> (-1:-3..-inf)!!(0..9); [-1,-3,-5,-7,-9,-11,-13,-15,-17,-19]

Conversely, it is also possible to convert a list to a stream:

> stream (1..10); 1:#<thunk 0x7fe537fe2b58>

This might appear a bit useless at first sight, since all elements of the stream are in fact already known. However, this operation then allows you to apply other functions to the list and have them evaluated in a lazy fashion.

Indexing of lists and tuples is always zero-based (i.e., indices run from
`0` to `#x-1`), and an exception will be raised if the index is out of
bounds:

> [1,2,3]!2; 3 > [1,2,3]!4; <stdin>, line 34: unhandled exception 'out_of_bounds' while evaluating '[1,2,3]!4'

The slicing operator `!!` takes a list or tuple and a list of indices and
returns the list or tuple of the corresponding elements, respectively.
Indices which are out of the valid range are silently ignored:

> (1..5)!!(3..10); [4,5] > (1,2,3,4,5)!!(3..10); 4,5

Indices can actually be specified in any order, so that you can retrieve any permutation of the members, also with duplicates. E.g.:

> (1..5)!![2,4,4,1]; [3,5,5,2]

This is less efficient than the case of contiguous index ranges (which is optimized so that it always works in linear time), because it requires repeated traversals of the list for each index. For larger lists you should hence use vectors or matrices instead, to avoid the quadratic complexity.

(The prelude actually implements the slicing operation in a fairly generic
way, so that it works with any kind of container data structure which
defines `!` in such a manner that it throws an exception when the index
is out of bounds. It also works with any kind of index container that
implements the `catmap` operation.)

This mostly comes straight from the Q prelude which in turn was based on the first edition of the Bird/Wadler book, and is very similar to what you can find in the Haskell prelude. Some functions have slightly different names, though, and of course everything is typed dynamically.

`any p xs`- tests whether the predicate
`p`holds for any of the members of`xs`

`all p xs`- tests whether the predicate
`p`holds for all of the members of`xs`

`cat xs`- concatenate a list of lists

`catmap f xs`- convenience function which combines
`cat`and`map`; this is also used to implement list comprehensions

`do f xs`- apply
`f`to all members of`xs`, like`map`, but throw away all intermediate results and return`()`

`drop n xs`- remove
`n`elements from the front of`xs`

`dropwhile p xs`- remove elements from the front of
`xs`while the predicate`p`is satisfied

`filter p xs`- return the list of all members of
`xs`satisfying the predicate`p`

`foldl f a xs`- accumulate the binary function
`f`over all members of`xs`, starting from the initial value`a`and working from the front of the list towards its end

`foldl1 f xs`- accumulate the binary function
`f`over all members of`xs`, starting from the value`head xs`and working from the front of the list towards its end;`xs`must be nonempty

`foldr f a xs`- accumulate the binary function
`f`over all members of`xs`, starting from the initial value`a`and working from the end of the list towards its front

`foldr1 f xs`- accumulate the binary function
`f`over all members of`xs`, starting from the value`last xs`and working from the end of the list towards its front;`xs`must be nonempty

`head xs`- return the first element of
`xs`;`xs`must be nonempty

`index xs x`search for an occurrence of

`x`in`xs`and return the index of the first occurrence, if any,`-1`otherwiseNote: This uses equality (

`==`) to decide whether a member of`xs`is an occurrence of`x`, so`==`must have an appropriate definition on the list members.

`init xs`- return all but the last element of
`xs`;`xs`must be nonempty

`last xs`- return the last element of
`xs`;`xs`must be nonempty

`listmap f xs`- convenience function which works like
`map`, but also deals with matrix and string arguments while ensuring that the result is always a list; this is primarily used to implement list comprehensions

`map f xs`- apply
`f`to each member of`xs`

`scanl f a xs`- accumulate the binary function
`f`over all members of`xs`, as with`foldl`, but return all intermediate results as a list

`scanl1 f xs`- accumulate the binary function
`f`over all members of`xs`, as with`foldl1`, but return all intermediate results as a list

`scanr f a xs`- accumulate the binary function
`f`over all members of`xs`, as with`foldr`, but return all intermediate results as a list

`scanr1 f xs`- accumulate the binary function
`f`over all members of`xs`, as with`foldr1`, but return all intermediate results as a list

`sort p xs`- Sorts the elements of the list
`xs`in ascending order according to the given predicate`p`, using the C`qsort`function. The predicate`p`is invoked with two arguments and should return a truth value indicating whether the first argument is "less than" the second. (An exception is raised if the result of a comparison is not a machine integer.)

`tail xs`- return all but the first element of
`xs`;`xs`must be nonempty

`take n xs`- take
`n`elements from the front of`xs`

`takewhile p xs`- take elements from the front of
`xs`while the predicate`p`is satisfied

Some useful (infinite) list generators, as well as some finite (and eager)
variations of these. The latter work like a combination of `take` or
`takewhile` and the former, but are implemented directly for better
efficiency.

`cycle xs`- cycles through the elements of the nonempty list
`xs`, ad infinitum

`cyclen n xs`- eager version of
`cycle`, returns the first`n`elements of`cycle xs`

`iterate f x`- returns the stream containing
`x`,`f x`,`f (f x)`, etc., ad infinitum

`iteraten n f x`- eager version of
`iterate`, returns the first`n`elements of`iterate f x`

`iterwhile p f x`- another eager version of
`iterate`, returns the list of all elements from the front of`iterate f x`for which the predicate`p`holds

`repeat x`- returns an infinite stream of
`x`s

`repeatn n x`- eager version of
`repeat`, returns a list with`n``x`s

`unzip xys`- takes a list of pairs to a pair of lists of corresponding elements

`unzip3 xyzs``unzip`with triples

`zip xs ys`- return the list of corresponding pairs
`(x,y)`where`x`runs through the elements of`xs`and`y`runs through the elements of`y`

`zip3 xs ys zs``zip`with three lists, returns a list of triples

`zipwith f xs ys`- apply the binary function
`f`to corresponding elements of`xs`and`ys`

`zipwith3 f xs ys zs`- apply the ternary function
`f`to corresponding elements of`xs`,`ys`and`zs`

Pure also has the following variations of `zipwith`/`zipwith3` which
throw away all intermediate results and return `()`. That is, these work
like `do` but pull arguments from two or three lists, respectively:

`dowith f xs ys`- apply the binary function
`f`to corresponding elements of`xs`and`ys`, return`()`

`dowith3 f xs ys zs`- apply the ternary function
`f`to corresponding elements of`xs`,`ys`and`zs`, return`()`

Pure strings are null-terminated character strings encoded in UTF-8, see the Pure Manual for details. The prelude provides various operations on strings, including a complete set of list-like operations, so that strings can be used mostly as if they were lists, although they are really implemented as C character arrays for reasons of efficiency. Pure also has some powerful operations to convert between Pure expressions and their string representation, see Eval and Friends for those.

Concatenation, indexing and slicing works just like with lists:

> "abc"+"xyz"; "abcxyz" > let s = "The quick brown fox jumps over the lazy dog."; > s!5; "u" > s!!(20..24); "jumps"

Checking for empty strings and determining the size of a string also works as expected:

> null ""; 1 > null s; 0 > #s; 44

You can search for the location of a substring in a string, and extract a substring of a given length:

`index s u`- Returns the (zero-based) index of the first occurrence of the substring
`u`in`s`, or -1 if`u`is not found in`s`.

`substr s i n`- Extracts a substring of (at most)
`n`characters at position`i`in`s`. This takes care of all corner cases, adjusting index and number of characters so that the index range stays confined to the source string.

Example:

> index s "jumps"; 20 > substr s 20 10; "jumps over"

Note that Pure doesn't have a separate type for individual characters.
Instead, these are represented as strings `c` containing exactly one
(UTF-8) character (i.e., `#c==1`). It is possible to convert such single
character strings to the corresponding integer character codes, and vice
versa:

`ord c`- Ordinal number of a single character string
`c`. This is the character's code point in the Unicode character set.

`chr n`- Converts an integer back to the character with the corresponding code point.

In addition, the usual character arithmetic works, including arithmetic sequences of characters, so that you can write stuff like the following:

> "a"-"A"; 32 > "u"-32; "U" > "a".."k"; ["a","b","c","d","e","f","g","h","i","j","k"]

Strings are also ordered lexicographically based on their character codes:

> "awe">"awesome"; 0 > "foo">="bar"; 1

For convenience, the prelude provides the following functions to convert between strings and lists (or other aggregates) of characters.

`chars s`,`list s`- Convert a string
`s`to a list of characters.

`tuple s`,`matrix s`- Convert a string
`s`to a tuple or (symbolic) matrix of characters, respectively.

`strcat xs`- Concatenate a list
`xs`of strings (in particular, this converts a list of characters back to a string).

`string xs`- Convert a list, tuple or (symbolic) matrix of strings to a string.
In the case of a list, this is synonymous with
`strcat`, but it also works with the other types of aggregates.

For instance:

> list "abc"; ["a","b","c"] > string ("a".."z"); "abcdefghijklmnopqrstuvwxyz"

The following functions are provided to deal with strings of "tokens" separated by a given delimiter string.

`split delim s`- Splits
`s`into a list of substrings delimited by`delim`.

`join delim xs`- Joins the list of strings
`xs`to a single string, interpolating the given`delim`string.

Example:

> let xs = split " " s; xs; ["The","quick","brown","fox","jumps","over","the","lazy","dog."] > join ":" xs; "The:quick:brown:fox:jumps:over:the:lazy:dog."

We mention in passing here that more elaborate string matching, splitting and replacement operations based on regular expressions are provided by the system module, see System Interface.

If that isn't enough already, most generic list operations carry over to
strings in the obvious way, treating the string like a list of characters.
(Some operations will actually return character lists in that case, so you
might have to apply `string` explicitly to convert these back to a
string.) For instance:

> filter (>="k") s; "qukrownoxumpsovrtlzyo" > string $ map pred "ibm"; "hal"

List comprehensions can draw values from strings, too:

> string [x+1 | x="HAL"]; "IBM"

The following routines are provided by the runtime to turn raw C `char*`
pointers (also called **byte strings** in Pure parlance, to distinguish them
from Pure's "cooked" UTF-8 string values) into corresponding Pure
strings. Normally you don't have to worry about this, because the C
interface already takes care of the necessary marshalling, but in some
low-level code these operations are useful. Also note that here and in the
following, the `cstring` routines also convert the string between the
system encoding and Pure's internal UTF-8 representation.

`string s`,`cstring s`- Convert a pointer
`s`to a Pure string.`s`must point to a null-terminated C string. These routines take ownership of the original string value, assuming it to be`malloc`ed, so you should only use these for C strings which are specifically intended to be freed by the user.

`string_dup s`,`cstring_dup s`- Convert a pointer
`s`to a Pure string. Like above, but these functions take a copy of the string, leaving the original C string untouched.

The reverse transformations are also provided. These take a Pure string to
a byte string (raw `char*`).

`byte_string s`,`byte_cstring s`- Construct a byte string from a Pure string
`s`. The result is a raw pointer object pointing to the converted string. The original Pure string is always copied (and, in the case of`byte_cstring`, converted to the system encoding). The resulting byte string is a`malloc`ed pointer which can be used like a C`char*`, and has to be freed explicitly by the caller when no longer needed.

Finally, it is also possible to convert Pure string lists to byte string
vectors and vice versa. These are useful if you need to pass an
`argv`-like string vector (i.e., a `char**` or `char*[]`) to C
routines. The computed C vectors are `malloc`ed pointers which have an
extra `NULL` pointer as the last entry, and should thus be usable for
almost any purpose which requires such a string vector in C. They also take
care of garbage-collecting themselves. The original string data is always
copied. As usual, the `cstring` variants do automatic conversions to the
system encoding.

`byte_string_pointer xs`,`byte_cstring_pointer xs`- Convert a list of Pure strings to a C
`char**`.

`string_list n p`,`cstring_list n p`- Convert a C
`char**`to a list of Pure strings.

Note that the back conversions take an additional first argument which
denotes the number of strings to retrieve. If you know that the vector is
`NULL`-terminated then this can also be an infinite value (`inf`) in
which case the number of elements will be figured out automatically.
Processing always stops at the first `NULL` pointer encountered.

The size of a matrix (number of elements) can be obtained using `#`, and
the `dim` function can be used to return its dimensions (number of rows
and columns):

> let x = {1,2,3;4,5,6}; #x; 6 > dim x; 2,3

`null x` can be used to check for empty matrices. Note that there are
various kinds of these, as a matrix may have zero rows or columns, or both.

Indexing and slicing works pretty much like in MATLAB and Octave, except
that the Pure operators `!` and `!!` are used (and indices are
zero-based). It is possible to access elements with a one-dimensional index
(in row-major oder):

> x!3; 4

Or you can specify a pair of row and column index:

> x!(1,0); 4

Slicing works accordingly. You can either specify a list of (one- or two-dimensional) indices, in which case the result is always a row vector:

> x!!(2..5); {3,4,5,6}

Or you can specify a pair of row and column index lists:

> x!!(0..1,1..2); {2,3;5,6}

The following abbreviations are provided to grab a slice from a row or column:

> x!!(1,1..2); {5,6} > x!!(0..1,1); {2;5}

Matrices can be compared with the `==` and `~=` operators which check
the dimensions and the matrix elements for equality:

> x == transpose x; 0

Most of the generic list operations are implemented on matrices, see
Common List Functions. Hence operations like `map` and `zipwith`
work as expected:

> map succ {1,2,3;4,5,6}; {2,3,4;5,6,7} > zipwith (+) {1,2,3;4,5,6} {1,0,1;0,2,0}; {2,2,4;4,7,6}

The matrix module also provides a bunch of other specialized matrix operations, including all the necessary operations for matrix comprehensions. We briefly summarize the most important operations below; please refer to matrices.pure for all the gory details. Also make sure you check Matrix Computations in the Pure Manual for some more examples, and the Record Functions section for an implementation of records using symbolic vectors.

The `matrix` function converts a list or tuple to a corresponding matrix.
`matrix` also turns a list of lists or matrices specifying the rows of
the matrix to the corresponding rectangular matrix; otherwise, the result
is a row vector. (In the former case, the `matrix` function may throw a
`bad_matrix_value x` exception in case of dimension mismatch, where `x`
denotes the offending submatrix.)

> matrix [1,2,3]; {1,2,3} > matrix [[1,2,3],[4,5,6]]; {1,2,3;4,5,6}

The `rowvector` and `colvector` functions work in a similar fashion,
but expect a list, tuple or matrix of elements and always return a row or
column vector, respectively (i.e., a 1xn or nx1 matrix, where n is the size
of the converted aggregate). Also, the `vector` function is a synonym for
`rowvector`. These functions can also be used to create recursive
(symbolic) matrix structures of arbitrary depth, which provide a nested
array data structure with efficient (constant time) element access.

> rowvector [1,2,3]; {1,2,3} > colvector [1,2,3]; {1;2;3} > vector [rowvector [1,2,3],colvector [4,5,6]]; {{1,2,3},{4;5;6}}

It's also possible to create a row or column vector from an arithmetic
sequence using the `rowvectorseq` and `colvectorseq` functions. Again,
`vectorseq` is provided as a synonym for `rowvectorseq`. These
operations are optimized for the case of int and double ranges.

> rowvectorseq 0 10 1; {0,1,2,3,4,5,6,7,8,9,10} > colvectorseq 0 10 1; {0;1;2;3;4;5;6;7;8;9;10} > vectorseq 0.0 0.9 0.1; {0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9}

These operations are also handy to define macros which optimize the construction of vectors from list ranges, so that the intermediate list values are actually never constructed. For instance:

> def vector (n1:n2..m) = vectorseq n1 m (n2-n1); > def vector (n..m) = vectorseq n m 1; > foo = vector (1..10); > bar = vector (0.0:0.1..0.9); > show foo bar bar = vectorseq 0.0 0.9 0.1; foo = vectorseq 1 10 1; > foo; bar; {1,2,3,4,5,6,7,8,9,10} {0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9}

(Note that the library doesn't contain any such macro rules, so you'll have to add them yourself if you want to take advantage of these optimizations.)

The `dmatrix`, `cmatrix`, `imatrix` and `smatrix` functions convert
a list or matrix to a matrix of the corresponding type (integer, double,
complex or symbolic). If the input is a list, the result is always a row
vector; this is usually faster than the `matrix` and `rowvector`
operations, but requires that the elements already are of the appropriate
type.

> imatrix [1,2,3]; {1,2,3} > dmatrix {1,2,3;4,5,6}; {1.0,2.0,3.0;4.0,5.0,6.0}

The `dmatrix`, `cmatrix` and `imatrix` functions can also be invoked
with either an int `n` or a pair `(n,m)` of ints as argument, in which
case they construct a zero rowvector or matrix with the corresponding
dimensions.

> imatrix 3; {0,0,0} > imatrix (2,3); {0,0,0;0,0,0}

There's also a number of more specialized conversions operations which are discussed under Matrix Inspection and Manipulation below.

Conversely, a matrix can be converted back to a flat list or tuple with the
`list` and `tuple` functions. In addition, the `list2` function
converts a matrix to a list of lists (one sublist for each row of the
matrix).

> tuple {1,2,3;4,5,6}; 1,2,3,4,5,6 > list {1,2,3;4,5,6}; [1,2,3,4,5,6] > list2 {1,2,3;4,5,6}; [[1,2,3],[4,5,6]] > list2 {1,2,3}; [[1,2,3]]

`dmatrixp x`,`cmatrixp x`,`imatrixp x`,`smatrixp x`,`nmatrixp x`- Check for different kinds of matrices (double, complex, int, symbolic and numeric, i.e., non-symbolic).

`vectorp x`,`rowvectorp x`,`colvectorp x`- Check for different kinds of vectors (these are just matrices with one row or column).

`stride x`- The stride of a matrix denotes the real row size of the underlying C
array, see the description of the
`pack`function below for further details. There's little use for this value in Pure, but it may be needed when interfacing to C.

`row x i`,`col x i`- Extract the
`i`th row or column of a matrix.

`rows x`,`cols x`- Return the list of all rows or columns of a matrix.

`diag x`,`subdiag x k`,`supdiag x k`:- Extract (sub-,super-) diagonals from a matrix. Sub- and super-diagonals
for
`k=0`return the main diagonal. Indices for sub- and super-diagonals can also be negative, in which case the corresponding super- or sub-diagonal is returned instead. In each case the result is a row vector.

`submat x (i,j) (n,m)`- Extract a submatrix of a given size at a given offset. The result shares
the underlying storage with the input matrix (i.e., matrix elements are
*not*copied) and so this is a comparatively cheap operation.

`rowcat xs`,`colcat xs`- Construct matrices from lists of rows and columns. These take either
scalars or submatrices as inputs; corresponding dimensions must match.
`rowcat`combines submatrices vertically, like`{x;y}`;`colcat`combines them horizontally, like`{x,y}`. Note: Like the built-in matrix constructs, these operations may throw a`bad_matrix_value x`exception in case of dimension mismatch, where`x`denotes the offending submatrix.

`matcat xs`- Construct a matrix from a (symbolic) matrix of other matrices and/or
scalars. This works like a combination of
`rowcat`and`colcat`, but draws its input from a matrix instead of a list of matrices, and preserves the overall layout of the "host" matrix. The net effect is that the host matrix is flattened out. If all elements of the input matrix are scalars already, the input matrix is returned unchanged.

`rowcatmap f xs`,`colcatmap f xs`- Combinations of
`rowcat`,`colcat`and`map`. These are used, in particular, for implementing matrix comprehensions.

`diagmat x`,`subdiagmat x k`,`supdiagmat x k`- Create a (sub-,super-) diagonal matrix from a row vector
`x`of size`n`. The result is always a square matrix with dimension`(n+k,n+k)`, which is of the same matrix type (double, complex, int, symbolic) as the input and has the elements of the vector on its`k`th sub- or super-diagonal, with all other elements zero. A negative value for`k`turns a sub- into a super-diagonal matrix and vice versa.

`re x`,`im x`,`conj x`- Extract the real and imaginary parts and compute the conjugate of a numeric matrix.

`pack x`,`packed x`- Pack a matrix. This creates a copy of the matrix which has the data in
contiguous storage. It also frees up extra memory if the matrix was
created as a slice from a bigger matrix (see
`submat`above) which has since gone the way of the dodo. The`packed`predicate can be used to verify whether a matrix is already packed. Note that even if a matrix is already packed,`pack`will make a copy of it anyway, so`pack`also provides a quick way to copy a matrix, e.g., if you want to pass it as an input/output parameter to a GSL routine.

`redim (n,m) x`,`redim n x`- Change the dimensions of a matrix without changing its size. The total
number of elements must match that of the input matrix. Reuses the
underlying storage of the input matrix if possible (i.e., if the matrix
is
`packed`). You can also redim a matrix to a given row size`n`. In this case the row size must divide the total size of the matrix.

`sort p x`Sorts the elements of a matrix (non-destructively, i.e., without changing the original matrix) according to the given predicate, using the C

`qsort`function. This works exactly the same as with lists (see Common List Functions), except that it takes and returns a matrix instead of a list. Note that the function sorts*all*elements of the matrix in one go (regardless of the dimensions), as if the matrix was a single big vector. The result matrix has the same dimensions as the input matrix. Example:> sort (<) {10,9;8,7;6,5}; {5,6;7,8;9,10}

If you'd like to sort the individual rows instead, you can do that as follows:

> sort_rows p = rowcat . map (sort p) . rows; > sort_rows (<) {10,9;8,7;6,5}; {9,10;7,8;5,6}

Likewise, to sort the columns of a matrix:

> sort_cols p = colcat . map (sort p) . cols; > sort_cols (<) {10,9;8,7;6,5}; {6,5;8,7;10,9}

Also note that the pure-gsl module provides an interface to the GSL routines for sorting numeric (int and double) vectors using the standard order. These will usually be much faster than

`sort`, whereas`sort`is more flexible in that it also allows you to sort symbolic matrices and to choose the order predicate.

`transpose x`Transpose a matrix. Example:

> transpose {1,2,3;4,5,6}; {1,4;2,5;3,6}

`rowrev x`,`colrev x`,`reverse x`- Reverse a matrix.
`rowrev`reverses the rows,`colrev`the columns,`reverse`both dimensions.

Last but not least, the matrix module also offers a bunch of low-level operations for converting between matrices and raw pointers. These are typically used to shovel around massive amounts of numeric data between Pure and external C routines, when performance and throughput is an important consideration (e.g., graphics, video and audio applications). The usual caveats concerning direct pointer manipulations apply.

`pointer x`- Get a pointer to the underlying C array of a matrix. The data is
*not*copied. Hence you have to be careful when passing such a pointer to C functions if the underlying data is non-contiguous; when in doubt, first use the`pack`function to place the data in contiguous storage, or use one of the matrix-pointer conversion routines below.

The following operations copy the contents of a matrix to a given pointer
and return that pointer, converting to the target data type on the fly if
necessary. The given pointer may also be `NULL`, in which case suitable
memory is malloced and returned; otherwise the caller must ensure that the
memory pointed to by `p` is big enough for the contents of the given
matrix.

`double_pointer p x``float_pointer p x``complex_pointer p x``complex_float_pointer p x``int_pointer p x``short_pointer p x``byte_pointer p x`

Conversely, the following functions allow you to create a numeric matrix
from a pointer, copying the data and converting it from the source type on
the fly if necessary. The source pointer `p` may also be `NULL`, in
which case the new matrix is filled with zeros instead. Otherwise the
caller must ensure that the pointer points to properly initialized memory
big enough for the requested dimensions. The given dimension may also be
just an integer `n` if a row vector is to be created.

`double_matrix (n,m) p``float_matrix (n,m) p``complex_matrix (n,m) p``complex_float_matrix (n,m) p``int_matrix (n,m) p``short_matrix (n,m) p``byte_matrix (n,m) p`

Finally, you can use the following operations to create a numeric matrix
view of existing data, without copying the data. The data must be double,
complex or int, the pointer must not be `NULL` and the caller must also
ensure that the memory persists for the entire lifetime of the matrix
object. The given dimension may also be just an integer `n` if a row
vector view is to be created.

`double_matrix_view (n,m) p``complex_matrix_view (n,m) p``int_matrix_view (n,m) p`

As of Pure 0.41, the prelude also provides a basic record data structure,
implemented as symbolic vectors of `key=>value` pairs which support a few
dictionary-like operations such as `member`, `insert` and indexing.
Records may be represented as row, column or empty vectors (i.e., the
number of rows or columns must be zero or one). They must be symbolic
matrices consisting only of "hash pairs" `key=>value`, where the keys can
be either symbols or strings. The values can be any kind of Pure data; in
particular, they may themselves be records, so records can be nested.

The following operations are provided. Please note that all updates of
record members are non-destructive and thus involve copying, which takes
linear time (and space) and thus might be slow for large record values; if
this is a problem then you should use dictionaries instead
(cf. Dictionaries). Or you can create mutable records by using expression
references (cf. Expression References) as values, which allow you to
modify the data in-place. Element lookup (indexing) uses binary search on
an internal index data structure and thus takes logarithmic time once the
index has been constructed (which is done automatically when needed, or
when calling `recordp` on a fresh record value).

Also note that records with duplicate keys are permitted; in such a case
the following operations will always operate on the *last* entry for a
given key.

`recordp x`- Check for record values.

`record x`- Normalizes a record. This removes duplicate keys and orders the record by
keys (using an apparently random but well-defined order of the key
values), so that normalized records are syntactically equal (
`===`) if and only if they contain the same hash pairs. For convenience, this function can also be used directly on lists and tuples of hash pairs to convert them to a normalized record value.

`#x`- The size of a record (number of entries it contains). Duplicate entries are counted. (This is in fact just the standard matrix size operation.)

`member x y`- Check whether
`x`contains the key`y`.

`x!y`- Retrieves the (last) value associated with the key
`y`in`x`, if any, otherwise throws an`out_of_bound`exception.

`x!!ys`- Slicing also works as expected, by virtue of the generic definition of slicing provided by the matrix data structure.

`insert x (y=>z)`,`update x y z`- Associate the key
`y`with the value`z`in`x`. If`x`already contains the key`y`then the corresponding value is updated (the last such value if`x`contains more than one association for`y`), otherwise a new member is inserted at the end of the record.

`delete x y`- Delete the key
`y`(and its associated value) from`x`. If`x`contains more than one entry for`y`then the last such entry is removed.

`keys x`,`vals x`- List the keys and associated values of
`x`. If the record contains duplicate keys, they are all listed in the order in which they are stored in the record.

Here are a few basic examples:

> let r = {x=>5, y=>12}; > r!y; r!![y,x]; // indexing and slicing 12 {12,5} > keys r; vals r; // keys and values of a record {x,y} {5,12} > insert r (x=>99); // update an existing entry {x=>99,y=>12} > insert ans (z=>77); // add a new entry {x=>99,y=>12,z=>77} > delete ans z; // delete an existing entry {x=>99,y=>12} > let r = {r,x=>7,z=>3}; r; // duplicate key x {x=>5,y=>12,x=>7,z=>3} > r!x, r!z; // indexing returns the last value of x 7,3 > delete r x; // delete removes the last entry for x {x=>5,y=>12,z=>3} > record r; // normalize (remove dups and sort) {x=>7,y=>12,z=>3} > record [x=>5, x=>7, y=>12]; // construct a normalized record from a list {x=>7,y=>12} > record (x=>5, x=>7, y=>12); // ... or a tuple {x=>7,y=>12}

More examples can be found in the Record Data section in the Pure Manual.

This prelude module is a collection of various lowlevel operations, which are implemented either directly by machine instructions or by C functions provided in the runtime. In particular, this module defines the basic arithmetic and logic operations on machine integers, bigints and floating point numbers, as well as various type checking predicates and conversions between different types. Some low-level pointer operations are also provided, as well as "sentries" (Pure's flavour of object finalizers) and "references" (mutable expression pointers).

The basic arithmetic and logic operations provided by this module are summarized in the following table:

Kind | Operator | Meaning |
---|---|---|

Arithmetic | + -
* /
div mod
^ |
addition, subtraction (also unary minus), multiplication, division (inexact), exact int/bigint division/modulus, exponentiation (inexact) |

Comparisons | == ~=
< >
<= >= |
equality, inequality, less than, greater than, less than or equal, greater than or equal |

Logic | ~
&& || |
logical not, and, or (short-circuit) |

Bitwise | not
and or
<< >> |
bitwise not, and, or, bit shifts |

Precedence and and associativity of the operators can be found in the operators table at the beginning of this section.

The names of some operations are at odds with C. Note, in particular, that
logical negation is denoted `~` instead of `!` (and, consequently,
`~=` denotes inequality, rather than `!=`), and the bitwise operations
are named differently. This is necessary because Pure uses `!`, `&` and
`|` for other purposes. Also, `/` always denotes inexact (double)
division in Pure, whereas the integer division operators are called `div`
and `mod`. (`%`, which is not defined by this module, also has a
different meaning in Pure; it's the exact division operator, see Rational
Numbers.)

The above operations are implemented for int, bigint and, where
appropriate, double and pointer operands. (Pointer arithmetic comprises
`+` and `-` and works in the usual way, i.e., `p-q` returns the byte
offset between two pointers `p` and `q`, and `p+n` or `p-n` offsets
a pointer `p` by the given integer `n` denoting the amount of bytes.)
The math module (see Mathematical Functions) also provides
implementations of the arithmetic and comparison operators for rational,
complex and complex rational numbers.

Note that the logical operations are actually implemented as special forms
in order to provide for short-circuit evaluation. This needs special
support from the compiler to work. The primitives module still provides
definitions for these, as well as other special forms like `quote` and
the thunking operator `&` so that they may be used as function values and
in partial applications, but when used in this manner they lose all their
special call-by-name properties; see Special Forms in the Pure Manual
for details.

The constants `inf` and `nan` are defined as the usual IEEE floating
point infinities and NaNs, and the predicates `infp` and `nanp` are
provided to check for these kinds of values.

In addition, the following arithmetic and numeric functions are provided:

`abs x`,`sgn x`- Absolute value and sign of a number.

`min x y`,`max x y`- Minimum and maximum of two values. This works with any kind of values which have the ordering relations defined on them.

`succ x`,`pred x`- Successor (
`+1`) and predecessor (`-1`) functions.

`gcd x y`,`lcd x y`- The greatest common divisor and least common multiple functions from the GMP library. These return a bigint if at least one of the arguments is a bigint, a machine int otherwise.

`pow x y`- Computes exact powers of ints and bigints. The result is always a
bigint. Note that
`y`must always be nonnegative here, but see the math module (Mathematical Functions) which deals with the case`y<0`using rational numbers.

These operations convert between various types of Pure values.

`hash x`- Compute a 32 bit hash code of a Pure expression.

`int x`,`bigint x`,`double x`,`pointer x`- Conversions between the different numeric and pointer types.

`ubyte x`,`ushort x`,`uint x`,`uint64 x`,`ulong x`- Convert signed (8/16/32/64) bit integers to the corresponding unsigned
quantities. These functions behave as if the value was "cast" to the
corresponding unsigned C type, and are most useful for dealing with
unsigned integers returned by external C routines. The routines always
use the smallest Pure int type capable of holding the result:
`int`for`ubyte`and`ushort`,`bigint`for`uint`,`uint64`and`ulong`. All routines take int parameters. In the case of`uint64`, a bigint parameter is also permitted (which is what the C interface returns for 64 bit values). Also note that`ulong`reduces to either`uint`or`uint64`, depending on the size of`long`for the host architecture.

The following rounding functions work with all kinds of numbers:

`floor x`,`ceil x`- Floor and ceil.

`round x`,`trunc x`- Round or truncate to an integer.

`frac x`- Fractional part (
`x-trunc x`).

Note that all these functions return double values for double arguments, so
if you need an integer result then you'll have to apply a suitable
conversion, as in `int (floor x)`.

A syntactic equality test is provided, as well as various type checking predicates.

`x===y`,`x~==y`,`same x y`- Syntactic equality. In difference to
`==`and`~=`this is defined on all Pure expressions. Basically, two expressions are syntactically equal if they print out the same in the interpreter. In the special case of pointer objects and closures, which do not have a syntactic representation in Pure,`x`and`y`must be the same object (same pointer value or function).

`intp x`,`bigintp x`,`doublep x`,`stringp x`,`pointerp x`,`matrixp x`- Predicates to check for the built-in types.

`charp x`- Single character string predicate.

`numberp x`,`complexp x`,`realp x`,`rationalp x`,`integerp x`- Additional number predicates.

`exactp x`,`inexactp x`- Check whether a number is exact (i.e., doesn't contain any double components).

`applp x`,`listp x`,`listnp x`,`tuplep x`- Predicates to check for function applications, proper lists, list nodes and proper tuples.

`funp x`,`lambdap x`,`thunkp x`,`varp x`- Predicates to check for various kinds of function objects (named, anonymous or thunk), and free variable symbols.

`atomp x`,`closurep x`- Convenience functions to check for atoms (named function or variable) and closures (named function or lambda).

The following operations let you peek at various internal information that the interpreter provides to Pure programs either for convenience or for metaprogramming purposes. They are complemented by the evaluation primitives discussed below, see Eval and Friends.

`ans`Retrieve the most recently printed result of a toplevel expression evaluated in the read-eval-print loop. This is just a convenience for interactive usage. Note that the

`ans`value will stick around until a new expression is computed. (It is possible to clear the`ans`value with the interactive command`clear ans`, however.) Example:> 1/3; 0.333333333333333 > ans/2; 0.166666666666667

`__locals__`Return a list with the local function bindings (

`with`clauses) visible at this point in the program. This is actually implemented as a built-in. The return value is a list of "hash" pairs`x=>f`where`x`is the (quoted) symbol denoting the function and`f`is the function itself (or its value, if`f`is a parameterless function). The`__locals__`function is useful for debugging purposes, as well as to implement dynamic environments. It is also used internally to implement the`reduce`macro, see Eval and Friends. Example:> __locals__ with foo x = x+1; x = a+b end; [x=>a+b,foo=>foo] > f 99 when _=>f = ans!1 end; 100

The prelude also provides some functions to retrieve various attributes of a function symbol which determine how the operation is applied to its operands or arguments. These functions all take a single argument, the symbol or function object to be inspected, and return an integer value.

`nargs x`- Get the argument count of a function object, i.e., the number of arguments it expects. Returns 0 for thunks and saturated applications, -1 for over-saturated applications and non-functions.

`arity x`- Determine the arity of an operator symbol. The returned value is 0, 1 or 2 for nullary, unary and binary symbols, respectively, -1 for symbols without a fixity declaration or other kinds of objects.

`fixity f`- Determine the fixity of an operator symbol. The fixity is encoded as an
integer
`10*n+m`where`n`is the precedence level (ranging from`0`to`PREC_MAX`, where`PREC_MAX`denotes the precedence of primary expressions, 16777216 in the current implementation) and`m`indicates the actual fixity at each level, in the order of increasing precedence (0 = infix, 1 = infixl, 2 = infixr, 3 = prefix, 4 = postfix). The fixity value of nonfix and outfix symbols, as well as symbols without a fixity declaration, is always given as`10*PREC_MAX`, and the same value is also reported for non-symbol objects. Infix, prefix and postfix symbols always have a`fixity`value less than`10*PREC_MAX`. (`PREC_MAX`isn't actually defined as a constant anywhere, but you can easily do that yourself by setting`PREC_MAX`to the fixity value of any nonfix symbol or non-symbol value, e.g.:`const PREC_MAX = fixity [];`)

Note that only closures (i.e., named and anonymous functions and thunks)
have a defined argument count in Pure, otherwise `nargs` returns -1
indicating an unknown argument count. Partial applications of closures
return the number of remaining arguments, which may be zero to indicate a
**saturated** (but unevaluated) application, or -1 for **over-saturated** and
constructor applications. (Note that in Pure a saturated application may
also remain unevaluated because there is no definition for the given
combination of arguments and thus the expression is in normal form, or
because the application was quoted. If such a normal form application is
then applied to some "extra" arguments it becomes over-saturated.)

The value returned by `nargs` always denotes the actual argument count of
the given function, regardless of the declared arity if the function also
happens to be an operator symbol. Often these will coincide (as, e.g., in
the case of `+` which is a binary operator and also expects two
arguments). But this is not necessarily the case, as shown in the following
example of a binary operator which actually takes *three* arguments:

> infix 0 oops; > (oops) x y z = x*z+y; > arity (oops); 2 > nargs (oops); 3 > nargs (5 oops 8); 1 > map (5 oops 8) (1..5); [13,18,23,28,33]

Pure provides some rather powerful operations to convert between Pure
expressions and their string representation, and to evaluate quoted
expressions (`'x`). The string conversions `str`, `val` and `eval`
also provide a convenient means to serialize Pure expressions, e.g., when
terms are to be transferred to/from persistent storage. (Note, however,
that this has its limitations. Specifically, some objects like pointers and
anonymous functions do not have a parsable string representation. Also see
the Expression Serialization section for some dedicated serialization
operations which provide a more compact binary serialization format.)

`str x`- Yields the print representation of an expression in Pure syntax, as a string.

`val s`- Parses a single simple expression, specified as a string in Pure syntax,
and returns the result as is, without evaluating it. Note that this is
much more limited than the
`eval`operation below, as the expression must not contain any of the special constructs (conditional expressions,`when`,`with`, etc.).

`eval x`- Parses any expression, specified as a string in Pure syntax, and returns
its value. In fact,
`eval`can also parse and execute arbitrary Pure code. In that case it will return the last computed expression, if any. Alternatively,`eval`can also be invoked on a (quoted) Pure expression, which is recompiled and then evaluated. Exceptions during evaluation are reported back to the caller.

`evalcmd x`- Like
`eval`, but allows execution of interactive commands and returns their captured output as a string. No other results are returned, so this operation is most useful for executing Pure definitions and interactive commands for their side-effects. (At this time, only the regular output of a few commands can be captured, most notably`bt`,`clear`,`mem`,`save`and`show`; otherwise the result string will be empty.)

`lasterr`- Reports errors in
`eval`and`evalcmd`. This string value will be nonempty iff a compilation or execution error was encountered during the most recent invokation of`eval`and`evalcmd`. In that case each reported error message is terminated with a newline character.

`reduce x`- Reevaluates an expression in a local environment. This dynamically
rebinds function symbols in the given expression to whatever local
definitions are in effect at the point of the
`reduce`call. Note that`reduce`is actually implemented as a macro which expands to the`reduce_with`primitive (see below), using the`__locals__`builtin to enumerate the bindings which are in effect at the call site.

`reduce_with env x`- Like
`reduce`above, but takes a list of replacements (given as hash pairs`u=>v`) as the first argument. The`reduce`macro expands to`reduce_with __locals__`.

Examples:

> str (1/3); "0.333333333333333" > val "1/3"; 1/3 > eval "1/3"; 0.333333333333333 > eval ('(1/3)); 0.333333333333333 > evalcmd "show evalcmd"; "extern expr* evalcmd(expr*);\n" > eval "1/3)"; eval "1/3)" > lasterr; "<stdin>, line 1: syntax error, unexpected ')', expecting '=' or '|'\n"

The `reduce` macro provides a restricted form of dynamic binding which is
useful to implement local rewriting rules. It is invoked without parameters
and expands to the curried call `reduce_with __locals__` of the
`reduce_with` primitive, which takes one additional argument, the
expression to be rewritten. The following example shows how to expand or
factorize an expression using local rules for the corresponding laws of
distributivity:

expand = reduce with (a+b)*c = a*c+b*c; a*(b+c) = a*b+a*c; end; factor = reduce with a*c+b*c = (a+b)*c; a*b+a*c = a*(b+c); end; expand ((a+b)*2); // yields a*2+b*2 factor (a*2+b*2); // yields (a+b)*2

Note that instances of locally bound functions are substituted back in the
computed result, thus the instances of `*` and `+` in the results
`a*2+b*2` and `(a+b)*2` shown above denote the corresponding globals,
not the local incarnations of `*` and `+` defined in `expand` and
`factor`, respectively.

`reduce` also adjusts to quoted arguments. In this case, the local rules
are applied as usual, but back-substituted globals are *not* evaluated in
the result:

> expand ((a+1)*2); a*2+2 > expand ('((a+1)*2)); a*2+1*2

Note that `reduce` only takes into account local *function* bindings from
`with` clauses, local *variable* bindings do not affect its operation in
any way:

> let y = [x,x^2,x^3]; > reduce y when x = u+v end; [x,x^2,x^3]

However, in such cases you can perform the desired substitution by turning
the `when` into a `with` clause:

> reduce y with x = u+v end; [u+v,(u+v)^2,(u+v)^3]

Or you can just invoke the underlying `reduce_with` builtin directly,
with the desired substitutions given as hash pairs in the first argument:

> reduce_with [x=>u+v] y; [u+v,(u+v)^2,(u+v)^3]

Like `str` and `eval`, the following `blob` and `val` operations
can be used to safely transfer expression data to/from persistent storage
and between different processes (using, e.g., POSIX shared memory, pipes or
sockets). However, `blob` and `val` use a binary format which is
usually much more compact and gets processed much faster than the string
representations used by `str` and `eval`. Also, `val` offers some
additional protection against transmission errors through a crc check. (The
advantage of the string representation, however, is that it's readable
plain text in Pure syntax.)

`blob x`- Stores the contents of the given expression as a binary object. The return value is a cooked pointer which frees itself when garbage-collected.

`val p`- Reconstructs a serialized expression from the result of a previous
invocation of the
`blob`function.

`blobp p`- Checks for a valid
`blob`object. (Note that`val`may fail even if`blobp`returns`true`, because for performance reasons`blobp`only does a quick plausibility check on the header information of the blob, whereas`val`also performs a crc check and verifies data integrity.)

`blob_size p`,`blob_crc p`- Determines the size (in bytes) and crc checksum of a blob, respectively.
`blob_size`always returns a bigint,`blob_crc`a machine int (use`uint`on the latter to get a proper unsigned 32 bit value). For convenience,`#p`is defined as an alias for`blob_size p`on`blob`pointers.

Example:

> let b = blob {"Hello, world!", 1/3, 4711, NULL}; > b; #b; uint $ blob_crc b; #<pointer 0x141dca0> 148L 3249898239L > val b; {"Hello, world!",0.333333333333333,4711,#<pointer 0>}

Please note that the current implementation has some limitations:

- Just as with
`str`and`eval`, runtime data (local closures and pointers other than the`NULL`pointer) can't be serialized, causing`blob`to fail. However, it*is*possible to transfer a global function, provided that the function exists (and is the same) in both the sending and the receiving process. (This condition can't be verified by`val`and thus is at the programmer's responsibilty.) - Sharing of subexpressions will in general be preserved, but sharing of
list and tuple
*tails*will be lost (unless the entire list or tuple is shared). - The
`val`function may fail to reconstruct the serialized expression even for valid blobs, if there is a conflict in symbol fixities between the symbol tables of the sending and the receiving process. To avoid this, make sure that symbol declarations in the sending and the receiving script match up.

`exit status`- Terminate the program with the given status code.

`force x`- Force a thunk (
`x&`), cf. Special Forms. This usually happens automagically when the value of a thunk is needed.

These are lowlevel operations dealing with pointer values. The usual
caveats apply, so *only* use these directly if you know what you're doing!

`addr symbol`- Get the address of a C symbol (given as a string) at runtime. The library containing the symbol must already be loaded. Note that this can in fact be any kind of externally visible C symbol, so it's also possible to get the addresses of global variables. The result is returned as a pointer. The function fails if the symbol was not found.

`calloc nmembers size`,`malloc size`,`realloc ptr size`,`free ptr`- Interface to
`malloc`,`free`and friends. These let you allocate dynamic buffers (represented as Pure pointer values) for various nasty purposes.

The following operations perform direct memory accesses. Use with care ... or else!

`get_byte ptr`,`get_short ptr`,`get_int ptr`,`get_int64 ptr`,`get_long ptr`,`get_float ptr`,`get_double ptr`,`get_string ptr`,`get_pointer ptr``put_byte ptr x`,`put_short ptr x`,`put_int ptr x`,`put_int64 ptr x`,`put_long ptr x`,`put_float ptr x`,`put_double ptr x`,`put_string ptr x`,`put_pointer ptr x`

Sentries are Pure's flavour of object **finalizers**. A sentry is simply an
object (usually a function) which gets applied to the target expression
when it is garbage-collected. This is useful to perform automatic cleanup
actions on objects with internal state, such as files. Pure's sentries are
*much* more useful than finalizers in other garbage-collected languages,
since it is guaranteed that they are called as soon as an object "goes out
of scope", i.e., becomes inaccessible.

`sentry f x`- Places a sentry
`f`at an expression`x`and returns the modified expression.

`clear_sentry x`- Removes the sentry from an expression
`x`.

`get_sentry x`- Returns the sentry of an expression
`x`(if any, fails otherwise).

Note that in the current implementation sentries can only be placed at symbols, applications, as well as function and pointer objects, but the department of fake statistics has assured us that this covers 99.7% of all practical uses. The sentry itself can be any type of object (but usually it's a function).

Example:

> using system; > sentry (\_->puts "Hello, world!") (1..3); [1,2,3] > clear ans Hello, world!

Note that setting a finalizer on a global symbol won't usually be of much
use since such values are cached by the interpreter. (However, the sentry
*will* be invoked if the symbol gets recompiled because its definition
has changed. This may be useful for some purposes.)

One common case where sentries are particularly useful are pointers. The library provides some special support for these by means of the following operations:

`cooked ptr`- Create a
**cooked**pointer which disposes itself after use. This is just a shorthand for`sentry free`. The given pointer`ptr`must be`malloc`ed to make this work.

`cookedp ptr`- Check whether a given pointer is cooked (this predicate actually assumes any pointer to be cooked which has a sentry set on it).

Example:

> using system; > let p = sentry (\p->puts "I'm done for!" $$ free p) (malloc 1024); > cookedp p; 1 > clear p I'm done for!

Besides their use as finalizers, sentries can also be handy in other circumstances, when you need to associate an expression with another, "invisible" value. In this case the sentry is usually some kind of data structure instead of a function to be executed at finalization time. For instance, here's how we can employ sentries to implement hashing of function values:

using dict; hashed f x = case get_sentry f of h::HDict = h!x if member h x; _ = y when y = f x; sentry (update h x y) f when h = case get_sentry f of h::HDict = h; _ = emptyhdict end; end; end; end;

E.g., consider the naive recursive definition of the Fibonacci function:

fib n::int = if n<=1 then 1 else fib (n-1)+fib (n-2);

A hashed version of the Fibonacci function can be defined as follows:

let hfib = hashed f with f n::int = if n<=1 then 1 else hfib (n-1)+hfib (n-2) end;

This turns the naive definition of the Fibonacci function (which has exponential time complexity) into a linear time operation:

> stats > fib 35; 14930352 4.53s > hfib 35; 14930352 0.25s

Finally, note that there can be only one sentry per expression but, building on the operations provided here, it's easy to design a scheme where sentries are chained. For instance:

chain_sentry f x = sentry (h (get_sentry x)) x with h g x = g x $$ f x; end;

This invokes the original sentry before the chained one:

> using system; > f _ = puts "sentry#1"; g _ = puts "sentry#2"; > let p = chain_sentry g $ sentry f $ malloc 10; > clear p sentry#1 sentry#2

You can chain any number of sentries that way:

> h _ = puts "sentry#3"; > let p = chain_sentry h $ chain_sentry g $ sentry f $ malloc 10; > clear p sentry#1 sentry#2 sentry#3

This scheme should work in most cases in which sentries are used just as
finalizers. However, there are other uses, like the "hashed function"
example above, where you'd like the original sentry to stay intact. This
can be achieved by placing the new sentry as a sentry on the *original
sentry* rather than the expression itself:

attach_sentry f x = sentry (sentry f (get_sentry x)) x;

Of course, this assumes that the original sentry is something that you can
place a sentry on. It also requires that the sentry will actually be
garbage-collected when its hosting expression gets freed, so it will *not*
work if the original sentry is a global:

> let p = chain_sentry g $ sentry f $ malloc 10; > clear p sentry#1

However, the attached sentry will work ok if you can ensure that the original sentry is a (partial or constructor) application, which also encompasses most kinds of aggregate Pure data. E.g.:

> let p = chain_sentry g $ sentry (f$) $ malloc 10; > clear p sentry#1 sentry#2

Expression references provide a kind of mutable data cells which can hold
any Pure expression. If you need these, then you're doomed. ;-) However,
they can be useful as a last resort when you need to keep track of some
local state or interface to the messy imperative world. Pure's references
are actually implemented as expression pointers so that you can readily
pass them as pointers to a C function which expects a `pure_expr**`
parameter. This may even be useful at times.

`ref x`- Create a reference pointing to
`x`initially.

`put r x`- Set a new value
`x`, and return that value.

`get r`- Retrieve the current value
`r`points to.

`unref r`- Purge the referenced object and turn the reference into a dangling pointer. (This is used as a sentry on reference objects and shouldn't normally be called directly.)

`refp x`- Predicate to check for reference values.

Note that manually removing the `unref` sentry of a reference turns the
reference into just a normal pointer object and renders it unusable as a
reference. Doing this will also leak memory, so don't!

The math.pure module provides Pure's basic math routines. It also defines complex and rational numbers.

To use the operations of this module, add the following import declaration to your program:

using math;

The module defines Euler's number `e = 2.71828...` and Ludolph's number
`pi = 3.1415...` as constants. It also provides a reasonably
comprehensive (pseudo) random number generator which uses the Mersenne
twister to avoid bad generators present in some C libraries.

Please note that as of Pure 0.41, the runtime library includes a newer
release of the Mersenne twister which fixes issues with some kinds of seed
values, and will yield different values for given seeds. Also, the
`random31` and `random53` functions have been added as a convenience to
compute unsigned 31 bit integers and 53 bit double values, and the
`srandom` function now also accepts an `int` matrix as seed value.

`random`- Return 32 bit pseudo random ints in the range
`-0x80000000..0x7fffffff`.

`random31`- Return 31 bit pseudo random ints in the range
`0..0x7fffffff`.

`random53`- Return pseudo random doubles in the range
`[0,1)`with 53 bits resolution.

`srandom seed`- Sets the seed of the generator to the given 32 bit integer. You can also
specify longer seeds using a nonempty row vector, e.g.:
`srandom {0x123, 0x234, 0x345, 0x456}`.

The following functions work with both double and int/bigint arguments. The result is always a double. For further explanations please see the descriptions of the corresponding functions from the C math library.

`sqrt x`- The square root function.

`exp x`,`ln x`,`log x`- Exponential function, natural and decadic logarithms.

`sin x`,`cos x`,`tan x`- Trigonometric functions.

`asin x`,`acos x`,`atan x`- Inverse trigonometric functions.

`atan2 y x`- Computes the arcus tangent of
`y/x`, using the signs of the two arguments to determine the quadrant of the result.

`sinh x`,`cosh x`,`tanh x`- Hyperbolic trigonometric functions.

`asinh x`,`acosh x`,`atanh x`- Inverse hyperbolic trigonometric functions.

We provide both rectangular (`x+:y`) and polar (`r<:a`)
representations, where `(x,y)` are the Cartesian coordinates and
`(r,t)` the radius (absolute value) and angle (in radians) of a complex
number, respectively. The `+:` and `<:` constructors (declared in the
prelude) bind weaker than all other arithmetic operators and are
non-associative.

The polar representation `r<:t` is normalized so that `r` is always
nonnegative and `t` falls in the range `-pi<t<=pi`.

The constant `i` is provided to denote the imaginary unit `0+:1`.

The arithmetic operations `+`, `*` etc. and the equality relations
`==` and `~=` work as expected, and the square root, exponential,
logarithms, trigonometric and hyperbolic trigonometric functions (see
Basic Math Functions) are extended to complex numbers accordingly. These
do *not* rely on complex number support in the C library, but should still
conform to IEEE 754 and POSIX, provided that the C library provides a
standards-compliant implementation of the basic math functions.

The following operations all work with both the rectangular and the polar representation, promoting real (double, int/bigint) inputs to complex where appropriate. When the result of an operation is again a complex number, it generally uses the same representation as the input (except for explicit conversions). Mixed rect/polar and polar/rect arithmetic always returns a rect result, and mixed complex/real and real/complex arithmetic yields a rect or polar result, depending on what the complex input was.

`complex x`- Convert any kind of number to a complex value.

`polar z`,`rect z`- Convert between polar and rectangular representations.

`cis t`- Create complex values on the unit circle. Note: To quickly compute
`exp (x+:y)`in polar form, use`exp x <: y`.

`abs z`,`arg z`- Modulus (absolute value) and argument (angle, a.k.a. phase). Note that you can also find both of these in one go by converting to polar form.

`re z`,`im z`- Real and imaginary part.

`conj z`- Complex conjugate.

Examples:

> using math; > let z = 2^(1/i); z; 0.769238901363972+:-0.638961276313635 > let z = ln z/ln 2; z; 0.0+:-1.0 > abs z, arg z; 1.0,-1.5707963267949 > polar z; 1.0<:-1.5707963267949

Please note that, as the `+:` and `<:` constructors bind weaker than
the other arithmetic operators, complex numbers *must* be parenthesized
accordingly, e.g.:

> (1+:2)*(3+:4); -5+:10

Pure's rational numbers are constructed with the **exact division** operator
`%` (declared in the prelude) which has the same precedence and fixity as
the other division operators.

The `%` operator returns a rational or complex rational for any
combination of integer, rational and complex integer/rational arguments,
provided that the denominator is nonzero (otherwise it behaves like `x div
0`, which will raise an exception). Machine int operands are always
promoted to bigints, thus normalized rationals always take the form `x%y`
where both the numerator `x` and the denominator `y` are bigints. For
other numeric operands `%` works just like `/`. Rational results are
normalized so that the sign is always in the numerator and numerator and
denominator are relatively prime. In particular, a rational zero is always
represented as `0L%1L`.

The usual arithmetic operations and equality/order relations are extended accordingly, as well as the basic math functions and the rounding functions, and will return exact (rational or complex rational) results where appropriate. Rational operations are implemented using the GMP bigint functions where possible, and thus are reasonably fast.

In addition, the module also provides following operations:

`rational x`Converts a real or complex value

`x`to a rational or complex rational. Note that the conversion from double values doesn't do any rounding, so it is guaranteed that converting the resulting rational back to a double reconstructs the original value.Conversely, the

`int`,`bigint`,`double`,`complex`,`rect`,`polar`and`cis`conversion functions are overloaded so that they convert a rational to one of the other number types.

`num x`,`den x`- Numerator and denominator of a rational
`x`.

Examples:

> using math; > 5%7 + 2%3; 29L%21L > 3%8 - 1%3; 1L%24L > pow (11%10) 3; 1331L%1000L > let x = pow 3 (-3); x; 1L%27L > num x, den x; 1L,27L > rational (3/4); 3L%4L

Note that doubles can't represent most rationals exactly, so conversion
from double to rational *will* yield funny results in many cases (which are
still accurate up to rounding errors). For instance:

> let x = rational (1/17); x; 4238682002231055L%72057594037927936L > num x/den x; 0.0588235294117647 > double (1%17); 0.0588235294117647

In difference to the syntactic predicates in Primitives, these check
whether the given value can be represented as an object of the given target
type (up to rounding errors). Note that if `x` is of syntactic type
`X`, then it is also of semantic type `X`. Moreover, `intvalp x =>
bigintvalp x => ratvalp x => realvalp x => compvalp x <=> numberp x`.

`realvalp x`- Check for real values (
`im x==0`).

`ratvalp x`- Check for rational values (same as
`realvalp`, except that IEEE 754 infinities and NaNs are excluded).

`bigintvalp x`- Check for "big" integer values which can be represented as a bigint.

`intvalp x`- Check for "small" integer values which can be represented as a machine int.

The standard library provides a variety of efficient container data structures for different purposes. These are all purely functional, i.e., immutable data structures implemented using different flavours of binary trees. This means that instead of modifying a data structure in-place, operations like insertion and deletion return a new instance of the container, keeping the previous instance intact. Nevertheless, all operations are performed efficiently, in logarithmic time where possible.

The container types are all implemented as abstract data structures, so client modules shouldn't rely on the internal representation. Each type provides a corresponding type tag (cf. Type Tags in the Pure Manual), as given in the "Data Structure" section of each type, which can be used to match values of the type, e.g.:

shift a::Array = rmfirst a;

All container types implement the equality predicates `==` and `~=` by
recursively comparing their members. In addition, the dictionary, set and
bag data structures also provide the other comparison predicates (`<`,
`<=` etc.) which check whether one dictionary, set or bag is contained in
another.

The array.pure module implements an efficient functional array data structure which allows to access and update individual array members, as well as to add and remove elements at the beginning and end of an array. All these operations are carried out in logarithmic time.

To use the operations of this module, add the following import declaration to your program:

using array;

Arrays are represented as balanced tree structures of the form `Array T`,
thus `Array` may be used as a type tag to restrict variables in pattern
matching.

The internal data structure `T` is a binary tree built with the following
constructors:

`array::nil`- the empty tree.
`array::tip value`- a leaf of the tree, holding the given value.
`array::bin balance left right`- a nonempty tree with the given balance flag (0 or 1) and left and right subtrees.

(This is for informational purposes only. The tree constructors are private, and client modules must not rely on the internal representation.)

`emptyarray`- return the empty array

`array xs`- create an array from a list
`xs`

`array2 xs`- create a two-dimensional array from a list of lists

`mkarray x n`- create an array consisting of
`n``x`'s

`mkarray2 x (n,m)`- create a two-dimensional array of
`n*m``x`'s

`arrayp x`- check whether
`x`is an array

`#a`- size of
`a`

`a!i`- return the
`i`th member of`a` `a!(i,j)`- two-dimensional subscript

`null a`- test whether
`a`is the empty array

`members a`,`list a`- list of values stored in
`a`

`members2 a`,`list2 a`- list of members in a two-dimensional array

`first a`,`last a`- first and last member of
`a`

`rmfirst a`,`rmlast a`- remove first and last member from
`a`

`insert a x`- insert
`x`at the beginning of`a`

`append a x`- append
`x`to the end of`a`

`update a i x`- replace the
`i`th member of`a`by`x`

`update2 a (i,j) x`- update two-dimensional array

Import the module:

> using array;

A one-dimensional array:

> let a::Array = array (0.0:0.1..1.0); > #a; members a; 11 [0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]

Indexing an array works in the usual way, using Pure's `!` operator. By
virtue of the prelude, slicing an array with `!!` also works as
expected:

> a!5; 0.5 > a!!(3..7); [0.3,0.4,0.5,0.6,0.7]

Updating a member of an array produces a new array:

> let b::Array = update a 1 2.0; > members b; [0.0,2.0,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]

Two-dimensional arrays can be created with `array2` from a list of
lists:

> let a2::Array = array2 [[i,x | x = [u,v,w]] | i = 1..2]; > members2 a2; [[(1,u),(1,v),(1,w)],[(2,u),(2,v),(2,w)]] > a2!(1,2); 2,w > a2!![(0,1),(1,2)]; [(1,v),(2,w)] > a2!!(0..1,1..2); [[(1,v),(1,w)],[(2,v),(2,w)]]

Here's how to convert an array to a Pure matrix:

> matrix $ members a; {0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0} > matrix $ members2 a2; {(1,u),(1,v),(1,w);(2,u),(2,v),(2,w)}

Converting back from a matrix to an array:

> let b2::Array = array2 $ list2 {(1,u),(1,v),(1,w);(2,u),(2,v),(2,w)}; > members2 b2; [[(1,u),(1,v),(1,w)],[(2,u),(2,v),(2,w)]]

Heaps are a kind of priority queue data structure which allows quick (constant time) access to the smallest member, and to remove the smallest member and insert new elements in logarithmic time. Our implementation does not allow quick update of arbitrary heap members; if such functionality is required, bags can be used instead (see Sets and Bags).

Heap members *must* be ordered by the `<=` predicate. Multiple instances
of the same element may be stored in a heap; however, the order in which
equal elements are retrieved is not specified.

To use the operations of this module, add the following import declaration to your program:

using heap;

Heaps are represented as balanced tree structures of the form `Heap T`,
thus `Heap` may be used as a type tag to restrict variables in pattern
matching.

The internal data structure `T` is a binary tree built with the following
constructors:

`heap::nil`- the empty tree.
`heap::bin balance value left right`- a nonempty tree holding the given value at its root node, with the given balance flag (0 or 1) and left and right subtrees.

(This is for informational purposes only. The tree constructors are private, and client modules must not rely on the internal representation.)

`emptyheap`- return the empty heap

`heap xs`- create a heap from a list
`xs`

`heapp x`- check whether
`x`is a heap

`#h`- size of a heap

`null h`- test whether
`h`is the empty heap

`members h`,`list h`- list the members of
`h`in ascending order

`first h`- the first (i.e., smallest) member of
`h`

`rmfirst h`- remove the first (i.e., smallest) member from
`h`

`insert h x`- insert
`x`into`h`

> let h::Heap = heap [5,1,3,11,3]; > members h; [1,3,3,5,11] > first h; 1 > members $ rmfirst h; [3,3,5,11]

The dict.pure module provides Pure's dictionary data types based on AVL trees. There are actually four different types to choose from, depending on whether you need ordered or hashed dictionaries and whether multiple values for the same key should be allowed or not.

`Dict`is an ordered dictionary. This assumes an ordered key type, i.e., the predicate`<`must be defined on the keys.`HDict`a hashed dictionary which works with any (mixture of) key types but stores members in an apparently random order.`MDict`is an ordered dictionary, like`Dict`, which allows multiple values to be associated with the same key.`HMDict`is a multi-valued dictionary, like`MDict`, but uses hashed keys like`HDict`.

`MDict` and `HMDict` are also colloquially referred to as (ordered or
hashed) *multidicts*. This implementation guarantees that different members
for the same key are always kept in the order in which they were inserted,
and this is also the order in which they will be retrieved by the
`members`, `keys`, `vals` and indexing operations.

The usual comparison predicates (`==`, `~=`, `<=`, `<` etc.) are
defined on all dictionary types, where two dictionaries are considered
"equal" (`d1==d2`) if they both contain the same `key=>value` pairs,
and `d1<=d2` means that `d1` is a sub-dictionary of `d2`, i.e., all
`key=>value` pairs of `d1` are also contained in `d2` (taking into
account multiplicities in the multidict case). Ordered dictionaries compare
keys using equality (assuming two keys `a` and `b` to be equal if
neither `a<b` nor `b<a` holds), while hashed dictionaries check for
syntactical equality (using `===`). The associated values are always
compared using the `==` predicate, so this needs to be defined if you
want to use any of the comparison operations.

The underlying AVL tree data structure can be found in the avltrees.pure module which is included in the library, but not to be invoked directly.

The AVL tree algorithm has its origin in the SWI-Prolog implementation of association lists. The original implementation was created by R. A. O'Keefe and updated for SWI-Prolog by Jan Wielemaker. For the original source see http://www.swi-prolog.org.

The port from SWI-Prolog and the deletion stuff (`rmfirst`, `rmlast`,
`delete`) missing in the Prolog implementation was provided by Jiri
Spitz. The generalization of the code to arbitrary combinations of
ordered/hashed and single-/multi-valued keys was done by Albert Graef.

To use the operations of this module, add the following import declaration to your program:

using dict;

Dictionaries are represented as balanced tree structures of the form `D
T` where `D` may be any of `Dict`, `HDict`, `MDict` and
`HMDict`. These constructors may be used as type tags to restrict
variables in pattern matching.

In any case, the internal data structure `T` is an AVL tree built with
the following constructors (from the avltrees module):

`avl::nil`- the empty tree.
`avl::bin (key=>value) balance left right`- a nonempty tree with given
`key`and`value`in the root node, where`left`and`right`are the left and right subtree, and`balance`is either 1, 0 or -1, denoting`|left|-|right|`= 1, 0, or -1, respectively.

(This is for informational purposes only. The tree constructors are private, and client modules must not rely on the internal representation.)

`emptydict`,`emptyhdict`,`emptymdict`,`emptyhmdict`- return an empty dictionary

`dict xs`,`hdict xs`,`mdict xs`,`hmdict xs`- create a dictionary of the corresponding type either from a list
`xs`of key-value pairs in the form`key=>value`, or from another dictionary; in the latter case the argument is converted to a dictionary of the desired target type

`dictp d`,`hdictp d`,`mdictp d`,`hmdictp d`- check whether
`x`is a dictionary of the corresponding type

`mkdict y xs`,`mkhdict y xs`,`mkmdict y xs`,`mkhmdict y xs`- create a dictionary from a list of keys and a constant value

`d1+d2`- union:
`d1+d2`adds the members of`d2`to`d1`

`d1-d2`- difference:
`d1-d2`removes the members of`d2`from`d1`

`d1*d2`- intersection:
`d1*d2`removes the members*not*in`d2`from`d1`

`#d`- size of a dictionary (the number of members it contains)

`d!x`- get the value from
`d`by key`x`; in the case of a multidict this actually returns a list of values (which may be empty if`d`doesn't contain`x`)

`null d`- test whether
`d`is an empty dictionary

`member d x`- test whether
`d`contains a member with key`x`

`members d`,`list d`- list the members of
`d`(in ascending order for ordered dictionaries)

`keys d`- list the keys of
`d`(in ascending order for ordered dictionaries)

`vals d`- list the values of
`d`

`first d`,`last d`- return the first and the last member of
`d`, respectively

`rmfirst d`,`rmlast d`- remove the first and the last member from
`d`, respectively

`insert d (x=>y)`,`update d x y`- insert
`x=>y`into`d`(this always adds a new member in a multidict, otherwise it replaces an existing value if there is one); note that`update`is just a fully curried version of`insert`, so`update d x y`behaves exactly like`insert d (x=>y)`

`delete d x`- remove
`x`from`d`if present (in the multidict case, only the first member with the given key`x`is removed)

`delete_val d (x=>y)`- remove a specific key-value pair
`x=>y`from`d`if present (in the multidict case, only the first instance of`x=>y`is removed); please also see the notes below regarding this operation

`delete_all d x`- remove all instances of
`x`from`d`(in the non-multidict case, this is just the same as`delete`)

Notes:

- The infix operators
`+`,`-`and`*`work like the corresponding set and bag operations (see Sets and Bags), treating dictionaries as collections of`key=>val`pairs. You can mix arbitrary operand types with these operations, as well as with the comparison operations; the necessary conversions from less general dictionary types (ordered, single-valued) to more general types (hashed, multi-valued) are handled automatically. - The
`delete_val`function always compares values using equality (`==`), so this predicate must be defined on the values stored in the dictionary in order to make this operation work. If there is more than one instance of the given value under the given key, the first such instance will be removed (which may be any instance that compares equal, not necessarily an exact match). - In the multidict case,
`delete_val`may require linear time with respect to the number of different values stored under the given key. Since this operation is also needed to implement some other multidict operations like comparisons, difference and intersection, these may end up requiring quadratic running times in degenerate cases (i.e., if the majority of members happens to be associated with only very few keys).

A normal (ordered) dictionary:

> using dict; > let d::Dict = dict ["foo"=>77,"bar"=>99.1]; > keys d; vals d; members d; ["bar","foo"] [99.1,77] ["bar"=>99.1,"foo"=>77]

Indexing a dictionary works in the usual way, using Pure's `!` operator.
An `out_of_bound` exception is thrown if the key is not in the
dictionary:

> d!"foo"; 77 > d!"baz"; <stdin>, line 5: unhandled exception 'out_of_bounds' while evaluating 'd!"baz"'

By virtue of the prelude, slicing a dictionary with `!!` also works as
expected:

> d!!["foo","bar","baz"]; [77,99.1]

A hashed dictionary can be used with any key values, which are stored in a seemingly random order:

> let h::HDict = hdict [foo=>77,42=>99.1]; > keys h; vals h; members h; [42,foo] [99.1,77] [42=>99.1,foo=>77] > h!foo; 77 > h!!keys h; [99.1,77]

Multidicts work in pretty much the same fashion, but allow more than one
value for a given key to be stored in the dictionary. In this case, the
indexing operation returns a list of all values for the given key, which
may be empty if the key is not in the dictionary (rather than throwing an
`out_of_bound` exception):

> let d::MDict = mdict ["foo"=>77,"bar"=>99.1,"foo"=>99]; > d!"foo"; d!"baz"; [77,99] []

Slicing thus returns a list of lists of values here:

> d!!["foo","bar","baz"]; [[77,99],[99.1],[]]

To obtain a flat list you can just concatenate the results:

> cat $ d!!["foo","bar","baz"]; [77,99,99.1]

Hashed multidicts provide both key hashing and multiple values per key:

> let h::HMDict = hmdict [foo=>77,42=>99.1,42=>77]; > keys h; vals h; members h; [42,42,foo] [99.1,77,77] [42=>99.1,42=>77,foo=>77] > h!42; [99.1,77]

There are also some set-like operations which allow you to add/remove the
members (`key=>val` pairs) of one dictionary to/from another dictionary,
and to compute the intersection of two dictionaries. For instance:

> let h1 = hmdict [a=>1,b=>2]; > let h2 = hmdict [b=>2,c=>3]; > members (h1+h2); [a=>1,c=>3,b=>2,b=>2] > members (h1-h2); [a=>1] > members (h1*h2); [b=>2]

It's possible to mix dictionaries of different types in these operations. The necessary conversions are handled automatically:

> let h1 = hmdict [a=>1,b=>2]; > let h2 = hdict [b=>3,c=>4]; > members (h1+h2); [a=>1,c=>4,b=>2,b=>3]

Note that the result will always be promoted to the most general operand type in such cases (a hashed multidict in the above example). If this is not what you want, you'll have to apply the necessary conversions manually:

> members (hdict h1+h2); [a=>1,c=>4,b=>3]

The set.pure module implements Pure's set data types based on AVL trees. These work pretty much like dictionaries (cf. Dictionaries) but only store keys (called "elements" or "members" here) without any associated data values. Hence sets provide membership tests like dictionaries, but no indexing operations.

There are four variations of this data structure to choose from, depending
on whether the set members are ordered or hashed, and whether multiple
instances of the same element are allowed (in this case the set is actually
called a *multiset* or a *bag*).

`Set`and`Bag`implement the ordered set types. They require that members be ordered, i.e., the predicate`<`must be defined on them.`HSet`and`HBag`implement the hashed set types which don't require an order of the members. Distinct members are stored in an apparently random order.

The usual comparison predicates (`==`, `~=`, `<=`, `<` etc.) are
defined on all set and bag types, where two sets or bags are considered
"equal" (`m1==m2`) if they both contain the same elements, and `m1<=m2`
means that `m1` is a subset or subbag of `m2`, i.e., all elements of
`m1` are also contained in `m2` (taking into account multiplicities in
the multiset case). Ordered sets and bags compare elements using equality
(considering two elements `a` and `b` to be equal if neither `a<b`
nor `b<a` holds), while hashed sets and bags check for syntactical
equality (using `===`).

The underlying AVL tree data structure can be found in the avltrees.pure module which is included in the library, but not to be invoked directly. The AVL tree algorithm has its origin in the SWI-Prolog implementation of association lists and was ported to Pure by Jiri Spitz, see Dictionaries for details.

To use the operations of this module, add the following import declaration to your program:

using set;

Sets and bags are represented as balanced tree structures of the form `S
T`, where `S` may be any of `Set`, `Bag`, `HSet` and `HBag`.
These constructors may be used as type tags to restrict variables in
pattern matching.

In any case, the internal data structure `T` is an AVL tree built with
the following constructors (from the avltrees module):

`avl::nil`- the empty tree.
`avl::bin key balance left right`- a nonempty tree with given
`key`(set element) in the root node, where`left`and`right`are the left and right subtree, and`balance`is either 1, 0 or -1, denoting`|left|-|right|`= 1, 0, or -1, respectively.

`emptyset`,`emptybag`,`emptyhset`,`emptyhbag`- return an empty set or bag

`set xs`,`bag xs`,`hset xs`,`hbag xs`- create a set or bag of the corresponding type from a list or another set
or bag
`xs`; in the latter case the argument is converted to a set or bag of the desired target type

`setp x`,`bagp x`,`hsetp x`,`hbagp x`- check whether
`x`is a set or bag of the corresponding type

`m1+m2`- set and bag union:
`m1+m2`adds the members of`m2`to`m1`

`m1-m2`- set and bag difference:
`m1-m2`removes the members of`m2`from`m1`

`m1*m2`- set and bag intersection:
`m1*m2`removes the members*not*in`m2`from`m1`

`#m`- size of a set or bag
`m`

`null m`- test whether
`m`is an empty set or bag

`member m x`- test whether
`m`contains`x`

`members m`,`list m`- list the members of
`m`(in ascending order for ordered sets and bags)

`first m`,`last m`- return the first and the last member of
`m`, respectively

`rmfirst m`,`rmlast m`- remove the first and the last member from
`m`, respectively

`insert m x`- insert
`x`into`m`(replaces an existing element in the case of a set)

`delete m x`- remove
`x`from`m`(in the`bag`case, only the first instance of`x`is removed)

`delete_all m x`- remove all instances of
`x`from`m`(in the set case, this is just the same as`delete`)

Note that the infix operators (`+`, `-`, `*`, as well as the
comparison operations) allow you to mix arbitrary operand types; the
necessary conversions from less general set types (ordered, set) to more
general types (hashed, multiset) are handled automatically.

Some basic set operations:

> let m::Set = set [5,1,3,11,3]; > members m; [1,3,5,11] > map (member m) (1..5); [1,0,1,0,1] > members $ m+set (3..6); [1,3,4,5,6,11] > members $ m-set (3..6); [1,11] > members $ m*set (3..6); [3,5]

The bag operations work in a similar fashion, but multiple instances are permitted in this case, and each instance counts as a separate member:

> let m::Bag = bag [5,1,3,11,3]; > members m; [1,3,3,5,11] > members $ delete m 3; [1,3,5,11] > members $ insert m 1; [1,1,3,3,5,11] > members $ m+bag (3..6); [1,3,3,3,4,5,5,6,11] > members $ m-bag (3..6); [1,3,11] > members $ m*bag (3..6); [3,5]

As already mentioned, operands of different types can be mixed with the infix operators; the necessary conversions are handled automatically. E.g., here's how you add a set to a bag:

> let m1::Bag = bag [5,1,3,11,3]; > let m2::Set = set (3..6); > members (m1+m2); [1,3,3,3,4,5,5,6,11]

Note that the result will always be promoted to the most general operand type in such cases (a bag in the above example). If this is not what you want, you'll have to apply the necessary conversions manually:

> members (set m1+m2); [1,3,4,5,6,11]

If set members aren't ordered then you'll get an exception when trying to create an ordered set or bag from them:

> set [a,b,c]; <stdin>, line 5: unhandled exception 'failed_cond' while evaluating 'set [a,b,c]'

In such a case hashed sets and bags must be used instead. These work analogously to the ordered sets and bags, but distinct members are stored in an apparently random order:

> members $ hset [a,b,c] * hset [c,d,e]; [c] > members $ hbag [a,b,c] + hbag [c,d,e]; [a,c,c,b,d,e]

This module offers some useful system routines, straight from the C library, as well as some convenience functions for wrapping these up in Pure. Even the "purest" program needs to do some basic I/O every once in a while, and this module provides the necessary stuff to do just that. The operations provided in this module should work (if necessary by a suitable emulation) on all supported systems. Most of the following functions are extensively documented in the C library manual pages, so we concentrate on the Pure-specific aspects here.

To use the operations of this module, add the following import declaration to your program:

using system;

Some functions of the system interface are provided in separate modules; see Additional POSIX Functions and Option Parsing.

`errno`,`set_errno n`,`perror msg`,`strerror n`- This value and the related routines are indispensable to give proper
diagnostics when system calls fail for some reason. Note that, by its
very nature,
`errno`is a fairly volatile value, don't expect it to survive a return to the command line in interactive sessions.

Example:

> using system; > fopen "junk" "r", perror "junk"; junk: No such file or directory fopen "junk" "r"

`setlocale category locale`- Set or retrieve the current locale.

Details are platform-specific, but you can expect that at least the
categories `LC_ALL`, `LC_COLLATE`, `LC_CTYPE`, `LC_MONETARY`,
`LC_NUMERIC` and `LC_TIME` are defined, as well as the following values
for the locale parameter: `"C"` or `"POSIX"` (the default POSIX
locale), `""` (the system default locale), and `NULL`, to just query
the current locale.

Other string values which can be passed as the locale argument depend on
the implementation, please check your local setlocale(3) documentation for
details. If locale is not `NULL`, the current locale is changed
accordingly. The return value is the new locale, or the current locale when
passing `NULL` for the locale parameter. In either case, the string
returned by `setlocale` is such that it can be passed to `setlocale` to
restore the same locale again. In case of an error, `setlocale` fails
(rather than returning a null pointer).

Please note that calling this function alters the Pure interpreter's idea
of what the current locale is. When the interpreter starts up, it always
sets the default system locale. Unless your scripts rely on a specific
encoding, setting the locale to either `"C"` or `""` should always be
safe.

Example:

> setlocale LC_ALL NULL; "en_US.UTF-8"

`trap action sig`- Establish or remove Pure signal handlers.

The action parameter of `trap` can be one of the predefined integer
values `SIG_TRAP`, `SIG_IGN` and `SIG_DFL`. `SIG_TRAP` causes the
given signal to be handled by mapping it to a Pure exception of the form
`signal sig`. `SIG_IGN` ignores the signal, `SIG_DFL` reverts to the
system's default handling. See `show -g SIG*` for a list of known signal
values on your system.

Note: When the interpreter runs interactively, most standard termination
signals (`SIGINT`, `SIGTERM`, etc.) are already set up to report
corresponding Pure exceptions; if this is not desired, you can use `trap`
to either ignore these or revert to the default handlers instead.

See Exception Handling in the Pure Manual for details and examples.

The usual date/time functions from the C library are all provided. This
includes some functions to retrieve wallclock and cpu time which usually
offer much better resolution than the venerable `time` function.

`time`- Reports the current time in seconds since the
**epoch**, 00:00:00 UTC, Jan 1 1970. The result is always a bigint (in fact, the`time`value is already 64 bit on many OSes nowadays).

`gettimeofday`- Returns wallclock time as seconds since the epoch, like
`time`, but theoretically offers resolutions in the microsec range (actual resolutions vary, but are usually in the msec range for contemporary systems). The result is returned as a double value (which also limits precision). This function may actually be implemented through different system calls, depending on what's available on the host OS.

`clock`- Returns the current CPU (not wallclock) time since an arbitrary point in
the past, as a machine int. The number of "ticks" per second is given by
the
`CLOCKS_PER_SEC`constant. Note that this value will wrap around approximately every 72 minutes.

`sleep t`,`nanosleep t`- Suspend execution for a given time interval in seconds.
`sleep`takes integer (int/bigint) arguments only and uses the`sleep()`system function.`nanosleep`also accepts double arguments and theoretically supports resolutions down to 1 nanosecond (again, actual resolutions vary). This function may actually be implemented through different system calls, depending on what's available on the host OS. Both functions usually return zero, unless the sleep was interrupted by a signal, in which case the time remaining to be slept is returned.

Examples:

> time,sleep 1,time; 1270241703L,0,1270241704L > gettimeofday,nanosleep 0.1,gettimeofday; 1270241709.06338,0.0,1270241709.16341

Here's a little macro which lets you time evaluations:

def timex x = y,(t2-t1)/CLOCKS_PER_SEC when t1 = clock; y = x; t2 = clock; end;

Example:

> timex (foldl (+) 0 (1..100000)); 705082704,0.07

The `tzset` function calls the corresponding routine from the C library
and initializes the (Pure) variables `tzname`, `timezone` and
`daylight` accordingly. See the tzset(3) manual page for details. This
routine is also called automatically when the system module is loaded, so
you only have to invoke it to get up-to-date information after changes to
the locale or the timezone. Example:

> tzset; () > tzname, timezone, daylight; ["CET","CEST"],-3600,1 > tzname!daylight; "CEST"

The following functions deal with date/time values in string and "broken-down" time format. See the ctime(3), gmtime(3), localtime(3), mktime(3), asctime(3), strftime(3) and strptime(3) manual pages for details.

`ctime t`- Convert a time value as returned by the
`time`function to a string in local time.

`gmtime t`,`localtime t`- Convert a time value to UTC or local time in "broken-down" form (a static
pointer to a
`tm`struct containing a bunch of`int`fields) which can then be passed to the`asctime`and`strftime`functions, or to`int_matrix`if you want to convert the data to a matrix; see the example below.

`mktime tm`- Converts broken-down time to a time value (seconds since the epoch). As
with
`time`, the result is always a bigint.

`asctime tm`,`strftime format tm`- Format broken-down time as a string.
`strftime`also uses a format string supplied by the user, see below for a list of the most important conversion specifiers.

`strptime s format tm`- Parse a date/time string
`s`according to the given format (using more or less the same format specifiers as the`strftime`function) and store the broken-down time result in the given`tm`struct. This function may fail, e.g., if`strptime`finds an error in the format string. Otherwise it returns the part of the string which wasn't processed, see the example below.

Examples:

> let t = time; t; 1270239790L > let tm = localtime t; tm; #<pointer 0x7ff97ecbdde0> > mktime tm; 1270239790L > asctime tm; "Fri Apr 2 22:23:10 2010\n" > int_matrix 9 tm; {10,23,22,2,3,110,5,91,1} > strftime "%c" tm; "Fri 02 Apr 2010 10:23:10 PM CEST" > strptime ans "%c" tm, int_matrix 9 tm; "CEST",{10,23,22,2,3,110,5,91,1}

In the above example, `strptime` was given a static pointer to a `tm`
struct returned by `localtime`. This always works, but in some situations
it may be preferable to allocate dynamic storage instead. This storage
should be properly initialized (zeroed out) before passing it to
`strptime`, since `strptime` only stores the values specified (at least
in principle; please consult your local C library documentation for
details). Also note that while POSIX only specifies nine `int` fields in
a `tm` struct, depending on the host operating system the struct may
contain additional public and private fields. The actual size of a `tm`
struct is given by the `SIZEOF_TM` constant, so a safe way to allocate
suitable dynamic storage for the `strptime` function is as follows:

> let tm = calloc 1 SIZEOF_TM; > strptime "4/2/10" "%D" tm, int_matrix 9 tm; "",{0,0,0,2,3,110,5,91,0}

Instead of explicitly allocating dynamic storage and converting it to a
Pure matrix later, you can also invoke `strptime` directly with an int
matrix of sufficient size:

> let tm = imatrix (SIZEOF_TM div SIZEOF_INT + 1); > strptime "4/2/10" "%D" tm, take 9 tm; "",{0,0,0,2,3,110,5,91,0}

Last but not least, to make calling `strptime` more convenient, you can
supply your own little wrapper function which takes care of allocating the
storage, e.g.:

mystrptime s format = s,take 9 tm when tm = imatrix (SIZEOF_TM div SIZEOF_INT + 1); s = strptime s format tm; end; > mystrptime "4/2/10" "%D"; "",{0,0,0,2,3,110,5,91,0}

Here is a list of some common format specifiers which can be used with the
`strftime` and `strptime` routines. These are all specified by POSIX
and should thus be available on most platforms. Note that many more formats
are usually supported than what is listed here, so please consult your
local manual pages for the complete list.

`%d`,`%m`,`%y`: Day of the month, month and year as decimal two-digit numbers.`%Y`: The year as a four-digit number which includes the century.`%H`,`%M`,`%S`: Hours (range`00`to`23`), minutes and seconds as decimal two-digit numbers.`%I`: The hours on a 12-hour clock (range`01`to`12`).

The following formats are locale-dependent:

`%a`,`%A`: Abbreviated and full weekday name.`%b`,`%B`: Abbreviated and full month name.`%p`: AM or PM.`%P`is the same in lowercase (`strftime`only).

There are also some useful meta-formats which specify various combinations of the above:

`%c`: The preferred date and time representation for the current locale.`%D`: The American date format (`%m/%d/%y`).`%F`: The ISO 8601 date format (`%Y-%m-%d`). (This is generally supported by`strftime`only, but`strptime`from GNU libc has it.)`%r`: The time in AM/PM notation (`%I:%M:%S %p`).`%R`: The time in 24-hour notation (`%H:%M`).`%T`: The time in 24-hour notation, including seconds (`%H:%M:%S`).

In addition, `%%` denotes a literal `%` character, `%n` newlines and
`%t` tabs. (For `strptime` the latter two are synonymous and match
arbitrary whitespace.)

Windows users should note that `strptime` isn't natively supported there.
A basic emulation is provided by the Pure runtime, but at present this only
supports the C locale.

The following process functions are available on all systems. (Some
additional process-related functions such as `fork`, `kill`, `wait`
and `waitpid` are available in the posix module, see Additional POSIX
Functions.)

`system cmd`- Execute a shell command.

`execv prog argv`,`execvp prog argv`,`execve prog argv envp`- Execute a new process.
`prog`denotes the name of the executable to be run,`argv`the argument vector (which repeats the program name in the first component), and`envp`a vector of environment strings of the form`"var=value"`. The`execv`function executes the program`prog`exactly as given, while`execvp`also performs a path search. The`execve`function is like`execv`, but also specifies an environment to be passed to the process. In either case, the new process replaces the current process. For convenience, both`argv`and`envp`can be specified as Pure string lists, which are automatically translated to the raw,`NULL`-terminated C string vectors (i.e.,`char**`) required by the underlying C functions, using the`byte_cstring_pointer`function (see Low-Level Operations in the String Functions section).

`spawnv mode prog argv`,`spawnvp mode prog argv`,`spawnve mode prog argv envp`- Spawn a new child process. These work like the corresponding MS Windows
functions; on Un*x systems this functionality is implemented using a
combination of
`fork`and`execv`. The arguments are the same as for the`execv`functions, except that there's an additional`mode`argument which specifies how the process is to be executed:`P_WAIT`waits for the process to finish, after which`spawnv`returns with the exit status of the terminated child process;`P_NOWAIT`makes`spawnv`return immediately, returning the process id; and`P_OVERLAY`causes the child process to replace its parent, just like with`execv`. (On Windows, there's an additional`P_DETACH`flag which works like`P_NOWAIT`but also turns the child process into a background task.)

Note that, in addition, the prelude provides the `exit` function which
terminates the program with a given exit code, cf. Other Special
Primitives.

Examples:

> system "pwd"; /home/ag/svn/pure-lang/trunk/pure/lib 0 > spawnvp P_WAIT "pwd" ["pwd"]; /home/ag/svn/pure-lang/trunk/pure/lib 0 > spawnv P_WAIT "/bin/sh" ["/bin/sh","-c","pwd"]; /home/ag/svn/pure-lang/trunk/pure/lib 0

Note that this module also defines the standard I/O streams `stdin`,
`stderr` and `stdout` as variables on startup. These are ready to be
used with the operations described below. Also note that for convenience
some of the following routines are actually Pure wrappers, rather than just
providing the raw C library routines.

`fopen name mode`,`popen cmd mode`- Open a file or a pipe. These take care of closing a file object automagically when it's garbage-collected, and fail (instead of returning a null pointer) in case of error, so that you can provide any desired error handling simply by adding suitable equations.

`fdopen fd mode`- Associates a file object with a given existing file descriptor. Otherwise
works like
`fopen`, so the resulting file is closed automatically when it's garbage-collected.

`freopen path mode fp`- Reopens a file object. The existing file object is closed. Otherwise
works like
`fopen`, so the resulting file is closed automatically when it's garbage-collected.

`fclose fp`,`pclose fp`- Close a file or a pipe.

`tmpfile`- Creates a unique temporary file (opened in
`"w+b"`mode) which gets deleted automatically when it is closed or the file object gets garbage-collected.

`feof fp`,`ferror fp`,`clearerr fp`- Check the end-of-file and error bits.
`clearerr`clears the error bit.

`fileno fp`- Returns the file descriptor associated with the given file.

`fflush fp`- Flushes the given file (or all open files if
`fp`is`NULL`).

`fgets fp`,`gets`- Pure wrappers for the C
`fgets`and`gets`functions which handle the necessary buffering automatically.

`fget fp`- A variation of
`fgets`which slurps in an entire text file at once.

`fputs s fp`,`puts s`- Output a string to the given file or
`stdout`, respectively. These are just the plain C functions. Note that`puts`automatically adds a newline, while`fputs`doesn't. Hmm.

`fread ptr size nmemb fp`,`fwrite ptr size nmemb fp`- Binary read/writes. Here you'll have to manage the buffers yourself. See the corresponding manual pages for details.

`fseek fp offset whence`,`ftell fp`,`rewind fp`- Reposition the file pointer and retrieve its current value. The constants
`SEEK_SET`,`SEEK_CUR`and`SEEK_END`can be used for the`whence`argument of`fseek`. The call`rewind fp`is equivalent to`fseek fp 0 SEEK_SET`(except that the latter also returns a result code). See the corresponding manual pages for details.

`setbuf fp buf`,`setvbuf fp buf mode size`Set the buffering of a file object, given as the first argument. The second argument specifies the buffer, which must be a pointer to suitably allocated memory or

`NULL`. The`mode`argument of`setvbuf`specifies the buffering mode, which may be one of the predefined constants`_IONBF`,`_IOLBF`and`_IOFBF`denoting no buffering, line buffering and full (a.k.a. block) buffering, respectively; the`size`argument denotes the buffer size.For

`setbuf`, the given buffer must be able to hold`BUFSIZ`characters, where`BUFSIZ`is a constant defined by this module.`setbuf fp buf`is actually equivalent to the following call (except that`setvbuf`also returns an integer return value):setvbuf fp buf (if null buf then _IONBF else _IOFBF) BUFSIZ

Please see the setbuf(3) manual page for details.

Examples:

> puts "Hello, world!"; Hello, world! 14 > map fileno [stdin,stdout,stderr]; [0,1,2] > let fp = fopen "/etc/passwd" "r"; > fgets fp; "at:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash\n" > fgets fp; "avahi:x:103:104:User for Avahi:/var/run/avahi-daemon:/bin/false\n" > ftell fp; 121L > rewind fp; () > fgets fp; "at:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash\n" > split "\n" $ fget $ popen "ls *.pure" "r"; ["array.pure","dict.pure","getopt.pure","heap.pure","math.pure", "matrices.pure","prelude.pure","primitives.pure","quasiquote2.pure", "quasiquote.pure","set.pure","strings.pure","system.pure",""]

C-style formatted I/O is provided through the following wrappers for the C
`printf` and `scanf` functions. Our wrapper functions take or return a
tuple of values, and check these against the format specifiers, so they
shouldn't segfault. However, only simple formats derived from `%cdioux`,
`%efg`, `%s` and `%p` are supported right now.

`printf format args`,`fprintf fp format args`- Print a formatted string to
`stdout`or the given file, respectively. Normally, these functions return the result of the underlying C routines (number of characters written, or negative on error). However, in case of an abnormal condition in the wrapper function (error in format string, argument mismatch), they will throw an exception.

`sprintf format args`- Print a formatted string to a buffer and return the result as a string.
Unlike the C routine, this wrapper just returns the string result, or a
null pointer in case of an error; otherwise, the error handling is the
same as with
`printf`and`fprintf`. The implementation actually uses the C routine`snprintf`for safety, and a suitable output buffer is provided automatically.

`scanf format`,`fscanf fp format`- Read formatted input from
`stdin`or the given file, respectively. These normally return a tuple (or singleton) with the converted values. An exception of the form`scanf_error ret`, where`ret`is the tuple of successfully converted values (which may be less than the number of requested input items), is thrown if end-of-file was met or another error occurred while still reading. The handling of other abnormal conditions (e.g., error in format string) is analogous to`printf`et al. Also note that our implementation here doesn't accept any of the length modifiers used by the C routines. Floating point values will*always*be read in double precision, so you just specify`"e"`,`"g"`etc. for these. However, the "assignment suppression" flag`"*"`*is*understood; the corresponding items will not be returned.

`sscanf s format`- This works exactly like
`fscanf`, but input comes from a string (first argument) rather than a file.

Examples:

> do (printf "%s%d\n") [("foo",5),("catch",22)]; foo5 catch22 () > sscanf "foo 5 22" "%s %d %g"; "foo",5,22.0

`stat path`- Return information about the given file. This is a simple wrapper around
the corresponding system call, see the stat(2) manual page for
details. The function returns a tuple with the most important fields from
the
`stat`structure, in this order:`st_dev`,`st_ino`,`st_mode`,`st_nlink`,`st_uid`,`st_gid`,`st_rdev`,`st_size`,`st_atime`,`st_mtime`,`st_ctime`. Among these,`st_mode`,`st_nlink`,`st_uid`and`st_gid`are simple machine integers, the rest is encoded as bigints (even on 32 bit platforms).

`lstat path`- Return information about the given symbolic link (rather than the file
it points to). On systems where this function isn't supported (e.g.,
Windows),
`lstat`is identical to`stat`.

`fstat fp`- Return information about the given file object. Same as
`stat`, but here the file is given as a file pointer created with`fopen`(see Basic I/O Interface above). Note that the corresponding system function actually takes a file descriptor, so the Pure implementation is equivalent to the C call`fstat(fileno(fp))`. This function might not be supported on all platforms.

For average applications, the most interesting fields are `st_mode` and
`st_size`, which can be retrieved with `stat filename!![2,7]`. Note
that to facilitate access to the `st_mode` field, the usual masks and
bits for file types (`S_IFMT`, `S_IFREG`, etc.) and permissions
(`S_ISUID`, `S_ISGID`, `S_IRWXU`, etc.) are defined as constants by
this module. Use the command `show -g S_*` in the interpreter to get a
full list of these. Other interesting fields are `st_atime`, `st_mtime`
and `st_ctime`, which can be accessed using `stat filename!!(8..10)`.
The values of these fields are the times of last access, last modification
and creation, respectively, which can be decoded using the appropriate time
functions like `ctime` or `strftime`, see Time Functions.

Examples:

> stat "/etc/passwd"; 64773L,9726294L,33188,1,0,0,0L,1623L,1250373163L,1242692339L,1242692339L > stat "/etc/passwd"!7; // file size 1623L > strftime "%c" $ localtime $ stat "/etc/passwd"!10; // creation time "Tue 19 May 2009 02:18:59 AM CEST" > sprintf "0%o" $ stat "/etc/passwd"!2 and not S_IFMT; // permissions "0644" > stat "/etc/passwd"!2 and S_IFMT == S_IFREG; // this is a regular file 1 > stat "/etc"!2 and S_IFMT == S_IFDIR; // this is a directory 1

`readdir name`- Read the contents of the given directory and return the names of all its entries as a list.

Example:

> readdir "/home"; ["ag",".",".."]

`fnmatch pat s flags`- Returns a simple truth value (1 if
`pat`matches`s`, 0 if it doesn't), instead of an error code like the C function.

`glob pat flags`- Returns a Pure list with the matches (unless there is an error in which case the integer result code of the underlying C routine is returned).

The available flag values and glob error codes are available as symbolic
`FNM_*` and `GLOB_*` constants defined as variables in the global
environment. See the fnmatch(3) and glob(3) manpages for the meaning of
these.

Example:

> glob "*.pure" 0; ["array.pure","dict.pure","getopt.pure","heap.pure","math.pure", "matrices.pure","prelude.pure","primitives.pure","set.pure", "strings.pure","system.pure"]

The POSIX regex functions (`regcomp` and `regexec`) have a somewhat
difficult calling sequence, hence we provide a couple of rather elaborate
high-level wrapper functions for use in Pure programs. These are
implemented in terms of a low-level interface provided in the runtime. (The
low-level interface isn't documented here, but these functions are also
callable if you want to create your own regular expression engines in Pure.
You might wish to take a look at the implementation of the high-level
functions in system.pure to see how this can be done.)

`regex pat cflags s eflags`- Compiles and matches a regex in one go, and returns the list of submatches (if any).

The arguments are:

`pat::string`, the regular expression pattern;`cflags::int`, the compilation flags (bitwise or of any of the flags accepted by regcomp(3));`s::string`, the subject string to be matched;`eflags::int`, the matching execution flags (bitwise or of any of the flags accepted by regexec(3)).

Symbolic `REG_*` constants are provided for the different flag values,
see the regcomp(3) manpage for an explanation of these. Two particularly
important compilation flags (to be included in the `cflags` argument) are
`REG_NOSUB`, which prevents submatches to be computed, and
`REG_EXTENDED`, which switches `regex` from "basic" to "extended"
regular expressions so that it understands all the regular expression
elements of egrep(1) in the pattern argument.

Depending on the flags and the outcome of the operation, the result of this function can take one of the following forms:

`regerr code msg`: This indicates an error during compilation of the pattern (e.g., if there was a syntax error in the pattern).`code`is the nonzero integer code returned by`regcomp`, and`msg`is the corresponding error message string, as returned by`regerror`. You can redefine the`regerr`function as appropriate for your application (e.g., if you'd like to print an error message or throw an exception).`0`or`1`: Just a truth value indicates whether the pattern matched or not. This will be the form of the result if the`REG_NOSUB`flag was specified for compilation, indicating that no submatch information is to be computed.`0`(indicating no match), or`1`(indicating a successful match), where the latter value is followed by a tuple of`(pos,substr)`pairs for each submatch. This will be the form of the result only if the`REG_NOSUB`flag was*not*specified for compilation, so that submatch information is available.

Note that, according to POSIX semantics, a return value of 1 does *not*
generally mean that the entire subject string was matched, unless you
explicitly tie the pattern to the beginning (`^`) and end (`$`) of the
string.

If the result takes the latter form, each `(pos,substr)` pair indicates a
portion of the subject string which was matched; `pos` is the position at
which the match starts, and `substr` is the substring (starting at
position `pos`) which was matched. The first `(pos,substr)` pair always
indicates which portion of the string was matched by the entire pattern,
the remaining pairs represent submatches for the parenthesized subpatterns
of the pattern, as described on the regcomp(3) manual page. Note that some
submatches may be empty (if they matched the empty string), in which case a
pair `(pos,"")` indicates the (nonnegative) position `pos` where the
subpattern matched the empty string. Other submatches may not participate
in the match at all, in which case the pair `(-1,"")` is returned.

The following helper functions are provided to analyze the result returned
by `regex`.

`reg_result res`- Returns the result of a
`regex`call, i.e., a`regerr`term if compilation failed, and a flag indicating whether the match was successful otherwise.

`reg_info res`- Returns the submatch info if any, otherwise it returns
`()`.

`reg n info`- Returns the
`n`th submatch of the given submatch info, where`info`is the result of a`reg_info`call.

`regs info`- Returns all valid submatches, i.e., the list of all triples
`(n,p,s)`for which`reg n == (p,s)`with`p>=0`.

In addition, the following convenience functions are provided to perform global regex searches, to perform substitutions, and to tokenize a string according to a given delimiter regex.

`regexg f pat cflags s eflags`- Perform a global regular expression search. This routine will scan the
entire string for (non-overlapping) instances of the pattern, applies the
given function
`f`to the`reg_info`for each match, and collects all results in a list. Note: Never specify the`REG_NOSUB`flag with this function, it needs the submatch info.

`regexgg f pat cflags s eflags`- This works like
`regexg`, but allows overlapping matches.

`regsub f pat cflags s eflags`- Replaces all non-overlapping instances of a pattern with a computed
substitution string. To these ends, the given function
`f`is applied to the`reg_info`for each match. The result string is then obtained by concatenating`f info`for all matches, with the unmatched portions of the string in between. To make this work,`f`must always return a string value; otherwise,`regsub`throws a`bad_string_value`exception.

`regsplit pat cflags s eflags`- Splits a string into constituents delimited by substrings matching the given pattern.

Please note that these operations all operate in an eager fashion, i.e., they process the entire input string in one go. This may be unwieldy or at least inefficient for huge amounts of text. As a remedy, the following lazy alternatives are available:

`regexgs f pat cflags s eflags`

`regexggs f pat cflags s eflags`

`regsplits pat cflags s eflags`- These work like
`regexg`,`regexgg`and`regsplit`above, but return a stream result which enables you to process the matches one by one, using "call by need" evaluation.

Let's have a look at some simple examples:

> let pat = "[[:alpha:]][[:alnum:]]*"; > let s = "1var foo 99 BAR $%&";

Simple match:

> regex pat 0 s 0; 1,1,"var"

Same without match info:

> regex pat REG_NOSUB s 0; 1

Global match, return the list of all matches:

> regexg id pat 0 s 0; [(1,"var"),(5,"foo"),(12,"BAR")]

Same with overlapping matches:

> regexgg id pat 0 s 0; [(1,"var"),(2,"ar"),(3,"r"),(5,"foo"),(6,"oo"),(7,"o"),(12,"BAR"), (13,"AR"),(14,"R")]

Note that `id` (the identity function) in the examples above can be
replaced with an arbitrary function which processes the matches. For
instance, if we only want the matched strings instead of the full match
info:

> regexg (!1) pat 0 s 0; ["var","foo","BAR"]

Lazy versions of both `regexg` and `regexgg` are provided which return
the result as a stream instead. These can be processed in a "call by need"
fashion:

> regexgs id pat 0 s 0; (1,"var"):#<thunk 0x7fb1b7976750> > last ans; 12,"BAR"

Let's verify that the processing is really done lazily:

> test x = printf "got: %s\n" (str x) $$ x; > let xs = regexgs test pat 0 s 0; got: 1,"var" > xs!1; got: 5,"foo" 5,"foo" > last xs; got: 12,"BAR" 12,"BAR"

As you can see, the first match is produced immediately, while the remaining matches are processed as the result stream is traversed. This is most useful if you have to deal with bigger amounts of text. By processing the result stream in a piecemeal fashion, you can avoid keeping the entire result list in memory. For instance, compare the following:

> let s2 = fget $ fopen "system.pure" "r"; > stats -m > #regexg id pat 0 s2 0; 7977 0.18s, 55847 cells > #regexgs id pat 0 s2 0; 7977 0.12s, 20 cells

We can also perform substitutions on matches:

> regsub (sprintf "<%d:%s>") pat 0 s 0; "1<1:var> <5:foo> 99 <12:BAR> $%&"

Or split a string using a delimiter pattern (this uses an egrep pattern):

> let delim = "[[:space:]]+"; > regsplit delim REG_EXTENDED s 0; ["1var","foo","99","BAR","$%&"] > regsplit delim REG_EXTENDED "The quick brown fox" 0; ["The","quick","brown","fox"]

The `regsplit` operation also has a lazy variation:

> regsplits "[[:space:]]+" REG_EXTENDED "The quick brown fox" 0; "The":#<thunk 0x7fb1b79775b0> > last ans; "fox"

Empty matches are permitted, too, subject to the constraint that at most one match is reported for each position (which also prevents looping). And of course an empty match will only be reported if nothing else matches. For instance:

> regexg id "" REG_EXTENDED "foo" 0; [(0,""),(1,""),(2,""),(3,"")] > regexg id "o*" REG_EXTENDED "foo" 0; [(0,""),(1,"oo"),(3,"")] > regexgg id "o*" REG_EXTENDED "foo" 0; [(0,""),(1,"oo"),(2,"o"),(3,"")]

This also works when substituting or splitting:

> regsub (cst " ") "" REG_EXTENDED "some text" 0; " s o m e t e x t " > regsub (cst " ") " ?" REG_EXTENDED "some text" 0; " s o m e t e x t " > regsplit "" REG_EXTENDED "some text" 0; ["","s","o","m","e"," ","t","e","x","t",""] > regsplit " ?" REG_EXTENDED "some text" 0; ["","s","o","m","e","","t","e","x","t",""]

Parenthesized subexpressions in a pattern yield corresponding submatch
information, which is useful if we need to retrieve the text matched by a
given subexpression. For instance, suppose we want to parse environment
lines, such as those returned by the shell's `set` command. These can be
dissected using the following regex:

> const env_pat = "^([^=]+)=(.*)$"; > const env_flags = REG_EXTENDED or REG_NEWLINE; > regex env_pat env_flags "SHELL=/bin/sh" 0; 1,0,"SHELL=/bin/sh",0,"SHELL",6,"/bin/sh"

Note that we again used an extended regex here, and we also added the
`REG_NEWLINE` flag so that we properly deal with multiline input. The
desired information is in the 4th and 6th element of the submatch info,
we can retrieve that as follows:

> parse_env s = regexg (\info -> info!3 => info!5) env_pat env_flags s 0; > parse_env "SHELL=/bin/sh\nHOME=/home/bar\n"; ["SHELL"=>"/bin/sh","HOME"=>"/home/bar"]

We can get hold of the real process environment as follows:

> let env = parse_env $ fget $ popen "set" "r"; > #env; 109 > head env; "BASH"=>"/usr/bin/sh"

Just for the fun of it, let's convert this to a record, providing easy random access to the environment variables:

> let env = record env; > env!!["SHELL","HOME"]; {"/bin/bash","/home/ag"}

The posix module provides some additional POSIX functions not available on all supported systems. (In particular, none of these functions are provided on MS Windows.) You can load this module in addition to the system module if you need the additional functionality. To use the operations of this module, add the following import declaration to your program:

using posix;

The following operations are provided. Please see the appropriate POSIX manual pages for a closer description of these functions.

`fork`- Fork a new process.

`getpid`,`getppid`- Get the process id of the current process and its parent process, respectively.

`wait status`,`waitpid pid status options`- Wait for any child process, or the given one. The
`status`argument must be a pointer to an`int`value, which is used to return the status of the child process.

`kill pid sig`- Send the given signal to the given process.

`raise sig`- Raise the given signal in the current process.

This is a quick-and-dirty replacement for the GNU getopt functions, ported from the Q library. To use the operations of this module, add the following import declaration to your program:

using getopt;

The following operation is provided:

`getopt opts args`- Parse options as given by
`opts`in the command line arguments`args`, return the parsed options along with a list of the remaining (non-option) command line arguments.

The `getopt` function takes two arguments: `opts`, a list of option
descriptions in the format described below, and `args`, a list of strings
containing the command line parameters to be parsed for options. The result
is a pair `(opts_return,args_return)` where `opts_return` is a list of
options and their values, and `args_return` is the list of remaining
(non-option) arguments. Options are parsed using the rules of GNU
getopt(1). If an invalid option is encountered (unrecognized option,
missing or extra argument, etc.), `getopt` throws the offending option
string as an exception.

The `opts_return` value is a list of "hash pairs" `opt=>val` where
`opt` is the (long) option name (as given by the `long_opt` field given
in the `opts` argument, see below) and `val` is the corresponding value
(`()` if none). Note that this format is ready to be passed to the
`dict` or `hdict` function, cf. Dictionaries, which makes it easy to
retrieve option values or check for the presence of options. (As of Pure
0.41, you can also convert the list into a vector using the `matrix`
function, and employ the record functions to access the option data,
cf. Record Functions.)

The `opts` argument of `getopt` must be a list of triples `(long_opt,
short_opt, flag)`, where `long_opt` denotes the long option,
`short_opt` the equivalent short option, and `flag` is one of the
symbolic integer values `NOARG`, `OPTARG` and `REQARG` which
specifies whether the option has no argument, an optional argument or a
required argument, respectively. Either `long_opt` or `short_opt`
should be a string value of the form `"--abc"` or `"-x"`,
respectively. Note that since the `long_opt` value is always used to
denote the corresponding option in the `opts_return` list, you always
have to specify a sensible value for that field. If no separate long option
name is needed, you can specify the same value as in the `short_opt`
field, or some other convenient value (e.g., an integer) which designates
the option. Conversely, to indicate that an option has no short option
equivalent, simply specify an empty option string for the `short_opt`
field.

Examples:

> let opts = [("--help", "-h", NOARG), // no argument > ("--version", "", NOARG), // no short option > ("--filename", "-f", REQARG), // required argument > ("--count", "-n", OPTARG)]; // optional argument > getopt opts ["foo", "-h", "--filename", "bar", "-n0", "baz"]; ["--help"=>(),"--filename"=>"bar","--count"=>"0"],["foo","baz"] > catch invalid_option $ getopt opts ["-h","-v"]; invalid_option "-v" > getopt opts [foo, "-h", bar]; ["--help"=>()],[foo,bar]

As the last example shows, non-option arguments (as well as option values specified as separate arguments) can actually be any values which are just copied to the result lists as is.