You are here: Atomic Programs of Joy

TITLE> Atomic Programs of Joy
* Abstract: *
Joy is a functional programming language based on the composition of
functions taking one stack as argument and yielding one stack as
value. Stacks can contain values of simple types such as truth
values, characters and integers, and values of aggregate types such as
sets, character strings and quoted programs with lists as a special
case. The stack functions include the usual operations such as
addition, comparison and list manipulation, but also many new kinds of
functions which dequote quoted programs in various ways. These new
functions behave like higher order functions, such as conditionals,
various forms of recursion, and for aggregates the map, fold and
filter functions. This paper gives an overview of the basic programs
from which others are built by composition and quotation.

Keywords: functional programming, function composition, higher order functions, quotation and dequotation of programs, combinators.

Joy programs denote functions which take *state*s as arguments
and as values. Programs are built from atomic programs which also
denote functions which take states as arguments and as values.
The meaning of compound programs has to be given in terms of the
meanings of atomic programs. It is useful to classify atomic programs
into categories depending on what kind of function they denote. A
coarse classification distinguishes just three, called 1) the
*literal*s, 2) the *operator*s and 3) the
*combinator*s.

Firstly, the *literal* atomic programs are those which look
like constants in conventional languages. They comprise literal
numbers (or, more correctly, numerals) such as integers, and other
literals of type character, string, truth value and set. Literals do
not denote numbers, characters, strings and so on, but they denote
functions which take one state as argument and yield as value another
state which is like the argument state except that the value of the
literal has been pushed onto the stack component.

Secondly, the *operator* atoms are those which look like *
n*-ary operators in other languages. They include the operations
such as for addition and the other arithmetical operations, and for
the various operations on other types. Like all programs, operators
denote functions from states to states, but the functions are not
defined on all states. An * n*-ary operator (such as the
binary addition operator) denotes a function which is defined only on
states whose stack component has * n* items (such as two
integers) on top.

The function yields as value another state which is like the argument
state except that the * n* items on the stack have been
replaced by the result (such as the sum).

Also included as operators are those atoms denoting mere structural
functions of the stack component such as `dup`

,
`swap`

and `pop`

, and those that involve input and
output such as `get`

and `put`

.

Thirdly, the *combinator* atoms are like operators in that they
require the top of the stack to contain certain items. But unlike
operators, they do not treat these items as passive data. Instead
they execute these items - and hence those items must be quoted
programs. So, combinators also denote functions which are defined
only on states having the appropriate number of quoted programs on top
of the stack. They yield as values another state which depends on the
argument state, including the quoted programs, and on the combinator
itself.

Literals, operators and combinators can be concatenated to form
*program*s. These may then be enclosed in square brackets to
form literal *quotation*s. Such literals are not atomic, but
if they occur in a program they are treated just like other literals:
they cause the quoted program to be pushed onto the stack. So,
literal quotations denote functions which take any stack as argument
and yield as value another stack which is like the argument stack
except that it has the quotation pushed on top. Quotations on top of
the stack can be treated like other values, they can be manipulated,
taken apart and combined, but they can also be executed by
combinators. If a quotation contains only literals, then it is a
value of the *list type*. The component literals do not have
to be of the same type, and they may include further quotations. If a
list is executed by a combinator, then its components are pushed onto
the stack.

The remainder of this paper is organised as follows: The next section describes the principal types of Joy and their literals. Following that are four sections on operators, the general ones applicable to all types, then those applicable to simple types such as integers, characters and truth values, then those applicable to aggregate types such as strings, sets and lists, and finally the predicates which return truth values. The following three sections deal with combinators, first those that are independent of any typing, then those specialised to simple types, and finally those specialised to aggregate types.

Some atoms can be applied to any stack and their effect is to push
something on the stack. The items that can be pushed are of various
types. There are *simple type*s such as integers, characters
and truth values. There are also *aggregate type*s such as
strings, sets and quoted programs. Atomic programs which push a
simple or aggregate value onto the stack will be called
*literal*s. A different kind of atom can be applied only to a
non-empty stack. Their effect is to re-organise the top few elements
of the stack.

Some, like `dup`, `swap` and `pop`, just
edit the top few elements. Others expect the top few elements to be
of certain types and their effect is to replace these elements by the
result of applying a function to them as arguments. These include the
arithmetic and relational operators for addition, multiplication and
comparisons, and the truthfunctional operators. They also include
list operations like concatenation. Collectively all of them will be
called *operator*s. A third kind of atoms expect quoted
programs on the top of the stack. Like the operators, they pop the
quoted programs off the stack. But they do not treat them as passive
data structures in the way operators do. Instead they cause the
quoted programs to be executed. These are called
*combinator*s.

After this rough exposition of the classification it is important to suppress any procedural reading and to revert to the official interpretation of Joy programs as denoting unary functions. So the three classes of atoms, namely literals, operators and combinators, all denote unary functions from and to states which include a stack as a principal component. The remainder of this section will deal with literals.

First, the literals of the *integer type*.
The following is the offical semantics of those atoms
that look like numerals:

A digit string such as "The semantics for the`123`

" denotes not anumberbut afunctionfrom states to states. For any state S1 as argument this function yields as its value another state S2 which is like S1 except that its stack component has an additional item, thenumber123 pushed onto it.

`true`

or `false`

has been pushed onto the stack. Literals of the
`'A`

denotes a function taking any
state into another state with that character pushed onto the stack.
The three types of truth values, characters and integers constitute what will be called simple types. They are simple in that their values do not have parts. There are also aggregate types which do have parts. The parts can be extracted by suitable operators and the aggregates can be constructed from their parts. Joy has three different aggregate types: sets of small numbers, strings of characters, and quoted programs, which have lists as a special case.

The *set type* comprises the usual unordered collections
familiar from set theory. The elements of a set are written inside
curly braces, such as `{1 3 5}`

. The whole is a literal
atom, and it denotes a function pushing that set onto the stack. For
most implementations on current machines the elements of sets will be
small numbers in the range `0`

.. `31`

, The
*string type* of character strings constitutes another
aggregate type. Its literals are written as zero or more characters
enclosed in double quotes, such as `"Hello"`

. Such a
literal again denotes a function, one which pushes that string.

The third aggregate type is that of *quoted program*s, or
briefly, *quotation*s. Its literals are written inside square
brackets. A program consists of zero or more literals, operators or
combinators. Enclosing it in square brackets turns it into a quoted
program. Quotations denote functions which push the quoted program;
the quoted program is not executed, it is pushed onto the stack in
"suspended animation". The following are quotations:

[1 2 3] ['A 'B "CDE" {10 11 12}] [pop dup *] [[[]]] [peter paul mary] ["" {} [] [hello "Hello"]A value of the

The following concern connections between quotations and the stack. The stack is normally a sequence of values of various types. This sequence is just a special list which is modified by programs. Since it is a list, it should be possible to put this list on top of the stack - that is to say, on top of itself. Also, it should be possible to make the list on top of the stack become the stack. Finally, it should be possible to create a new, empty stack. There are three operators that do just that:

stack unstack newstackThe

`unstack`

operator undoes what the `stack`

operator does, but the reverse is true only in special cases. The
`stack`

operator pushes a quotation onto the stack,
and the `unstack`

operator expects a quotation on the stack
and makes that the new stack.
It is sometimes useful to treat several types together. The
*numeric* types are integers and characters, the
*Boolean* types are truth values and sets, and the
*sequence* types are strings and lists. A *leaf* is
anything which is not a list, and a *tree* is either a leaf or
a (possibly empty) list of trees.

This completes the brief survey of the six principal types and their literals. Other types might be included in more elaborate versions of Joy. Obvious simple types to add are real (and perhaps complex) numbers, and an enumeration type as in Pascal. It is less clear what new aggregate types are useful since lists already are so versatile. Records and arrays are certainly possible. Only files will be considered briefly below.

First, the following unary operators are defined on all stacks containing at least one element:

pop dupThe top element does not have to satisfy any particular condition, it can be of any type. The

The following binary operators are defined on all stacks containing at least two elements:

swap popd popop dupdThe

The following ternary operators are defined for all stacks containing at least three elements:

swapd rollup rolldownThe

There is another *ternary operator*:

choiceThe

`X`

, `Y`

and `Z`

, with
`Z`

on top. The third value from the top, `X`

,
has to be a truth value. If it is `true`

, then the
`choice`

operator just leaves `Y`

on top of the
stack, and `X`

and `Z`

disappear. On the other
hand, if `X`

is false, then the `choice`

operator
just leaves `Z`

on top of the stack, and `X`

and
`Y`

disappear. This operator is related to two combinators
`ifte`

and `branch`

which are explained in the
next sections.
There is another operator for multi-choices. It expects a non-empty list of non-empty lists on top of the stack and below that one further item.

opcaseThe

`first`

members of the lists. When a match is found, the
`rest`

of that list is pushed onto the stack. If no match
is found, then the last list is used as the default.
The following two operators handle input and output:

put getThe

+ - * / % max minThe following unary operators are defined for all numeric types.

succ pred abs signThe

`-1`

, `0`

or `+1`

depending on
whether the parameter is negative, zero or positive.
The following mathematical functions are provided:

fact exp fib nfib gcdThe unary

The type of truth values is one of the Boolean types. The operators are

and or xor notThe three binary operators

first second third restThe

The following binary operators require an aggregate and a potential member on top of the stack:

cons swonsThe

`swons`

is equivalent to the composition
`swap cons`

, and hence its name. The two operators are
The following unary operators also require a non-empty aggregate on top of the stack:

uncons unswonsThe

`cons`

and `swons`

.
There are two operators for *index*ing in various ways:

at of drop takeThese four binary operators expect an aggregate and a number. That number is used for indexing into the aggregate. The

`drop`

operator
returns an aggragate like `take`

operator returns an aggregate like
The following are some general operators for aggregates:

size reverse concat swoncat zip flatten transposeThe unary

`swap`

first. The binary
The type of sets is another of the Boolean types. The operators are

and or xor notThe three binary operators

`{0..31}`

.
The following operators on sequences deal with ordering of their elements:

qsort qsort1 mergeThe unary

`concat`

operator in that it produces a single sequence.
The difference is that it picks elements from the two sequences in
accordance with their order. If the two sequences were sorted, then
the result of merging them is also sorted.
The following are arithmetic operations for lists of numbers:

sum product scalarproductThe first two expect a list of numbers, the

The following unary operators expect an aggregate on top of the stack and leave a list of various subaggregates on top of the stack:

frontlist restlist powerlist subseqlist permlistLet the size of the aggregate be (N). The

`frontlist`

operator returns a list, beginning with the
empty aggregate, obtained by successively adding the last, second last
... first member of the original aggregate. The `restlist`

operator returns a list, beginning with the original aggregate, by
successively removing the first, second ... last member of the
original aggregate. The A related binary operator is

insertlistThe

A related binary operator for finding the *cartesian product* is

cartproductwhich expects two aggregates that do not have to be of the same type. The

The following unary operators expect a tree:

treeflatten treestrip treereverse treesizeThe

odd even positive negativeThe

`true`

or `false`

just in case the parameter is
odd or even. The `true`

or `false`

just in case
the parameter is positive or negative -- note that truth values and
characters are never negative.
The following binary predicates are defined for all numeric types, they have their usual meaning:

= != < <= > >=The following are two unary predicates defined for all types:

null smallThe

The following binary predicates test aggregates for members:

in hasThe

The following binary predicate is defined for two aggregates of the same kind:

equalThe

`equal`

predicate is true if the two aggregates have the
same members. For strings and lists this means same members in the
same positions. For lists this means recursive equality.
Sometimes it is necessary to test a parameter for its type. This is done by the following unary predicates:

logical char integer set string list leafThe predicates

Sometimes it is useful to operate on quoted predicates to obtain another quoted predicate. There are three such operators:

conjoin disjoin negateThe two operators

Combinators can be classified in many ways: in terms of the number of
expected quotations, in terms of the total number of expected
parameters, quotations and others, in terms of their behaviour, and so
on. To fix the terminology, combinators will be called *
unary*, * binary*, * ternary* and so on, if they
expect one or two or three quotations, and so on. But note that many
combinators expect further parameters below the quotations which they
will execute. The following are some simple *unary
combinator*s which require the top of the stack to be *
one* quotation.

i x y

The `i` combinator pops the quotation off the stack and
executes it, effectively by *dequoting*. The `x`
combinator leaves the quotation on the stack and executes it.
Consequently the `x`

combinator will be executing on a stack
which has as its top element the very same quotation which it is
currently executing. The `y` combinator first converts the
quotation `[P]`

into a different quotation `[Q]`

with the following strange property: if `[Q]`

is ever called
by some combinator, then it builds a copy of itself on top of the
stack and then executes the `[P]`

-part of itself. After
this conversion, the `y`

combinator calls the
`[Q]`

it has constructed. In this way the `y`

combinator builds some of the behaviour of the `x`

combinator into the `[Q]`

.

Another unary combinator is

nullaryNo matter how many parameters the quotation consumes from the stack when

The next unary combinators allow manipulation of the stack below the top few elements:

dip dipd dipddThe

`X`

to be below the quotation. It removes the quotation and
`X`

, saves `X`

somewhere, executes the quotation
on the remainder of the stack, and finally restores `X`

.
The Three further unary combinators are

app1 app2 app3Apart from the quotation which they expect on top of the stack, they require one or two or three further elements on the stack. So the

`app2`

combinator requires two further elements, say
`X`

and `Y`

. In this case the quotation will be
executed twice, once with `X`

on top of the stack and once
with `Y`

on top of the stack. The executions could be done
in any order, even concurrently, provided there are no side effects.
If both executions terminate, both should leave behind a non-empty
stack with respectively `X'`

and `Y'`

on top.
These two values, in their order, are then pushed onto the stack in
place of `X`

and `Y`

. The two other combinators
`app1`

and `app3`

behave analogously: The
`app1`

combinator causes just one execution of the
quotation, and it replaces `X`

by `X'`

. The
`app3`

combinator cases three executions of the quotation,
and it replaces `X`

, `Y`

and `Z`

by
`X'`

, `Y'`

and `Z'`

, maintaining the
order.

The *binary combinator*s expect two quotations on top of the
stack.

b cleaveThe

`[P]`

and
`[Q]`

, with `[Q]`

on top. It removes the two
quotations and executes first `[P]`

and then
`[Q]`

. The `X`

. It also first
executes `[P]`

, with `X`

on top, and then saves
the top result element, say `P(X)`

. Then it executes
`[Q]`

, again with `X`

, and saves the top result as
`Q(X)`

. Finally it restores the stack to what it was below
`X`

and pushes the two results `P(X)`

and
`Q(X)`

.
The *ternary combinator*s expect three quotations on top of the
stack. One of the most important is

ifteThe

`true`

the then-part is executed,
otherwise the else-part is executed.
There are two combinators for doing simple looping:

whiledo tailrecThe binary

`ifte`

combinator in that it has a test, the while-part,
which is second on the stack. The combinator repeatedly executes the
while-part and while that yields `true`

it executes the
other part, the do-part. The ternary
The *quaternary combinator*s expect four quotations on top of
the stack.

linrec binrec genrecThe

`ifte`

combinator it executes the if-part, and if that
yields true it executes the else-part. Otherwise it executes the
else1-part, then it recurses with all four parts, and finally it
executes the else2-part. The It differs from the other two combinators in that after the execution of the else1-part nothing in particular is executed, but a program consisting of the four parts and the combinator is pushed onto the stack. The else2-part thus has it available as a parameter.

For linear recursion the if-part often is `[null]`

and the
else1-part often is either `[pred]`

for numeric types or
`[uncons]`

for aggregate types. The two parts are built
into

primrecfor

There are several combinators which do not have a fixed number of quotation parameters. Instead they use a list of quotations. They are

cond condlinrecThe

`ifte`

combinator. It expects a
non-empty list of programs, each consisting of a quoted if-part
followed by a then-part. The various if-parts are executed until one
is found that returns `true`

, and then its corresponding
then-part is executed.
The last program in the list is the default which is executed if none
of the if-parts yield `true`

. The `condlinrec`
combinator is similar, it expects a list of pairs or triples of quoted
programs. Pairs consist of an if-part and a then1-part, and triples
have an additional then2-part. Again the first if-part that yields
`true`

selects its corresponding then1-part for execution.
If there is a then2-part, the combinator first recurses and then
executes the then2-part. The last program is the default, it does not
have an if-part.

The `cleave`

combinator also has a generalisation:

constructexpects two parameters, a quotation and above that a list of quotations. Each quotation in the list will produce a value that will eventually be pushed onto the stack, and the first quotation determines the stack onto which these values will be pushed.

The following binary combinator expects a truth value below its two quotation parameters:

branchThe

`choice`

operator and the `ifte`

combinator. The truth value below
the two quotations determines which of the two quotations will be
executed. If the truth value is `true`

, then the if-part,
the second parameter, is executed, otherwise the then-part, the top
parameter, is executed.
The following unary combinator expects a numeric value below its quotation parameter:

timesThe

The stack is just a list, so any list could serve as the stack, including a list which happens to be on top of the stack. The following unary combinator expects a list below its quotation parameter:

infraThe

The following unary combinator expects an aggregate below its quotation parameter:

stepThe

There is a related combinator for stepping through two aggregates:

step2The

The following combinators for aggregates are mostly familiar from list processing languages:

map fold filter splitAll four step through the members of the aggregate in the same manner as the

`step`

combinator. The
The `filter` combinator needs a quotation which computes a
truth value, so it is a test. It applies the test to each member of
the aggregate and creates a new aggregate containing those members of
the original which pass the test. The resulting aggregate is of the
same types as the parameter. The `split` combinator only
makes sense in a language in which a function can return two values.

It is like the `filter`

combinator except that it returns
two aggregates - one containing the members of the original which did
not pass the test, and above that another containing those which did
pass the test. The resulting aggregates are of the same type as the
parameter. In both result aggregates the ordering of the original
aggregate is preserved in case they are strings or lists.

The following unary combinators expect an aggregate below their quotation parameter which must compute a truth value:

some allThe

`true`

if some
members of the aggregate pass the test of the quotation, otherwise it
returns `false`

. The `true`

if all members of the aggregate pass the test of the
quotation, otherwise it returns `false`

. For empty
aggregates `some`

returns `false`

and
`all`

returns `true`

.
The following unary combinator expects two aggregates and above that a program suitable for combining their respective elements:

zipwithThe

The following unary combinators expect a program and below that a tree:

treestep treemap treefilter treefoldThey all resemble corresponding combinators for aggregates. The

`step`

handles members of an
aggregate. The
There are two tree combinators which are similar to the
`genrec`

combinator:

treerec treerecgenand above that two quotations,

`[O]`

and `[C]`

.
If the tree is a leaf, then `[O]`

is executed, typically an
operation on a leaf. If the tree is not a leaf, then then combinator
`[C]`

is executed, and it will find on top of the stack the
program `[[O] [C] treerec]`

. The slightly more general
`[O1]`

, `[O2]`

and
`[C]`

. If the tree is a leaf, then `[O1]`

is
executed. If it is not a leaf, then first `[O2]`

is
executed, and then the combinator `[C]`

is executed which
will find `[[O1] [O2] [C] treerecgen]`

on top of the stack.