Algebraic Specification for Input-Output in Abstract Data Types

Abstract: Abstract Data Types (ADTs) are used in the design of software systems. The operations of a data type are usually specified algebraically. Working with data types is a methodology that yields improved designs when followed. In this paper we study the addition of input-output operations to the algebraic specification of the operations of a data type. The motivation is that input-output operations are used in actual implementations, so a specification that includes them is more complete than one that does not. Since our programs need input-output operations, adding them to the specification is justified. We consider the definition of input-output operations in functional programming, in particular in Haskell. Our input-output specifications are not identical to Haskell programs, although some of them are similar. We specify input-output operations in a form similar to the specification of the other operations. The result is an algebraic specification of the input-output operations for many frequently used data types. The language considered is sufficiently expressive to model all these operations. The technique is illustrated by a variety of examples: we start with sequences, continue with sets and finish with dictionaries. The specifications presented in this paper can be used as specifications of methods of ADT definitions in object-oriented programming. We use the term Abstract Data Type to refer to a class of objects defined by a representation-independent specification.


Introduction
Abstract Data Types (ADTs) are used in the design of software systems. The idea is to specify operations on data independently of the implementation.
Algebraic specification is usually used to specify the operations of a data type. Its notation is similar to that of functional programming.
It is often said that correct functional programs are easier to write than correct imperative programs, because the notation is closer to mathematics. The idea is to use algebraic specification, which resembles functional languages, as a specification language.
The use of data types is a methodology or style of working which yields improved design when followed. It is useful to consider a collection of operations at design time and then specify them in increasingly greater levels of detail until an executable implementation is achieved [8][9][10][15].
One possibility is to write the algebraic specification first (without worrying about efficiency), continue with an implementation using recursive functions, and possibly finish with iterative functions or procedures to obtain a more efficient implementation.
In this paper we study the addition of input-output operations to the algebraic specification of operations over a data type.
The motivation is that input-output operations are used in actual implementations. We specify input-output operations in a form similar to the specification of the other operations.
There are two chief concerns in devising a technique for specification: to define a notation that allows a rigorous, representation-independent definition of operations, and to learn to use such a notation [8].
A good data type specification should give enough information to define the type, but without limiting the possible implementations. Algebraic specification is appropriate for data type design, since it meets these criteria.
Algebraic specification separates the relevant detail of what from the irrelevant detail of how. We use the term Abstract Data Type to refer to a class of objects defined by a representation-independent specification.
We supply the functionality of the operations by giving the name, domain and range of each operation. But relying on one's intuition about the meaning of names is not enough. There are isomorphic functionalities, for example between Stacks and Queues. We need to specify the semantics of the operations of the type to distinguish them.
Thus an algebraic specification of an abstract type consists of a syntactic specification, providing names, domains and ranges, and a semantic specification, consisting of a set of equations that define the meaning of the operations by stating their relationship to one another.
Dealing with input and output in ADTs, as in functional languages (for example Haskell [16][17]), poses the problem that it requires side effects. A mathematical function always returns the same result for the same arguments. Any IO library should provide operations to read and write basic types such as Integers, Strings, etc. We would like these operations to be functions, but they are not. An operation that reads a String from the keyboard cannot be a function, since it will not return the same String every time.
We cannot think of things like "read a String from the keyboard" or "print an Integer" as functions in the pure mathematical sense. We give them another name (used in Haskell): Actions. They have special types (in our case the following):
input_String: input → String
output_integer: Integer → output
We put actions together by writing them between { and }, ending each with a semicolon. Instead of the <- used in Haskell we use let.
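For comparison, the two actions above might be written in Haskell itself roughly as follows. This is only a sketch: the names inputString, outputInteger, lengthOf and echoLength are hypothetical and do not come from the specifications in this paper.

```haskell
-- Hypothetical Haskell counterparts of input_String and output_integer.
-- The type IO String marks an action that yields a String when it is run,
-- not a function that always returns the same String.
inputString :: IO String
inputString = getLine

outputInteger :: Integer -> IO ()
outputInteger = print

-- A pure helper, kept apart from the actions.
lengthOf :: String -> Integer
lengthOf = fromIntegral . length

-- Sequencing of actions. Haskell writes `s <- inputString` where the
-- notation of this paper writes `let s = input_String()`.
echoLength :: IO ()
echoLength = do
  s <- inputString
  outputInteger (lengthOf s)
```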
Our input-output specifications are not identical to Haskell programs, although some of them are similar. They are equations that must be satisfied, and they are not necessarily applied to constructors, although sometimes they are.
What abstraction means is summarized in [18]. Some of the names used for this concept are Abstraction, Modularity, Encapsulation, Information Hiding and Separation of Concerns. As noted there, in the early days of computing a programming language came with built-in types (such as integers, booleans and strings) and built-in procedures for input-output, and users could define their own procedures. A major advance in software development was the idea of abstract types: that one could design a programming language to allow user-defined types too. This idea came out of the work of many researchers, notably Dahl, Dijkstra, Hoare, Parnas, Liskov and Guttag [1][2][3][4][5][6][7][8][9][10][14][15].
The key idea of data abstraction is that a type is characterized by the operations you can perform on it. What made abstract types new and different was the focus on operations: the user of the type does not need to worry about how its values are actually stored; all that matters are the operations. Critically, a good abstract data type should be representation independent: changes in representation have no effect on code outside the abstract type itself. For example, the operations offered by Stack are independent of whether it is represented as a linked list or as an array.
The rest of the paper is organized as follows. Section 2 presents the constructions we use in the specifications. Section 3 gives the specification of sequences, Section 4 the specification of sets and Section 5 the specification of dictionaries. Section 6 presents the conclusions, and Section 7 suggests further work.

Algebraic Specification and Input-Output
The constructions we allow in our algebraic specifications are the following:
1. parameters
2. if-then-else
3. let-in expressions
4. boolean expressions
5. recursion
6. sequencing of instructions inside {}
7. return
Constructions 1, 2, 4 and 5 were used in [8]. Besides if-then-else we use if-then. We add a let operation, used as in functional programs, and sequencing, which is used in some input specifications and in the output specifications. Finally, we have a return operation that finishes a function.
We consider the specification of two kinds of input operations:
1) reading a specific number of elements
2) reading until the sentinel EOF appears at the input
For the output, we traverse the ADT, printing the elements until an empty data type is reached.
We will specify three kinds of Collections: Sequences, Sets and Dictionaries. Inside Sequences we consider Stacks and Queues. As Dictionaries we consider Binary Search Trees and Closed Hash tables.
Sequencing of operations is used as in Haskell [16][17], where input-output operations are defined by sequences of actions. In Haskell, sequences of statements are introduced by the do notation; we omit the do keyword. Besides input-output operations, we use other operations belonging to the ADT under consideration. We define sequencing by putting the operations inside braces and ending each with a semicolon.
We classify the operations we specify into:
1) constructors
2) observers
3) selectors
4) extenders
Constructors are used to define the elements of the ADT, observers return a boolean, selectors return parts of the objects under consideration, and extenders are the remaining operations.
In what follows, the operations input_item and output_item can be seen either as polymorphic functions, where item is a type variable, or as monomorphic functions, where item is a concrete type. More about polymorphism and monomorphism can be found in [13].

Specification of Sequences
A sequence is an ordered collection of elements. There is a first element, a second, etc., and each of them occupies a definite position in the sequence. This family contains different ADTs. We will study the specification of input-output for the following sequences: Stacks and Queues.

Specification of Stacks
A stack is a special kind of sequence in which all insertions and deletions take place at one end, called the top. Another name for a stack is LIFO, or last-in-first-out.
Let us begin with the algebraic specification of Stacks without input-output. From now on, we sometimes use pattern matching when specifying the semantics of operations.
ADT Stack(item)
Syntax:
constructors:
NewStack:
Pop(Push(s,i))=s
Top(t) precondition not(IsNewStack(t))
Top(Push(s,i))=i
We indicate preconditions that must hold before an operation is applied, and we specify an operation only for the cases in which its precondition holds. An alternative, followed in [8], is to have a value undefined that is returned when an operation is applied to a value that does not satisfy the precondition, for example Top(NewStack).
We consider the following kinds of operations: constructors, whose semantics is primitive and is not given explicitly, but through the application of other operations to the elements they construct; observers, which give boolean values when applied to elements built by the constructors; and selectors, which return the parts from which the elements are constructed.
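As an illustration, the Stack equations and the precondition on Top could be transcribed into Haskell as sketched below. The Haskell names are hypothetical, and the use of Maybe to make the precondition explicit (for Pop as well as Top) is an assumption of this sketch, with Nothing playing the role of the undefined value of [8].

```haskell
-- Constructors of the ADT: NewStack and Push.
data Stack item = NewStack | Push (Stack item) item
  deriving (Eq, Show)

-- Observer: returns a boolean.
isNewStack :: Stack item -> Bool
isNewStack NewStack = True
isNewStack _        = False

-- Selectors. Outside the precondition not(IsNewStack(t)) the result
-- is Nothing instead of being left unspecified.
pop :: Stack item -> Maybe (Stack item)
pop (Push s _) = Just s        -- Pop(Push(s,i)) = s
pop NewStack   = Nothing

top :: Stack item -> Maybe item
top (Push _ i) = Just i        -- Top(Push(s,i)) = i
top NewStack   = Nothing
```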

Specification of Input in Stacks
Input operations are applied to (). Usually input operations read either a specific number of values or values until a sentinel appears at the input. We assume that the input-output operations of primitive types such as Integer, Char, Bool and String are given.
Consider first the case in which we read a predefined number of elements. We use the functions input_int and input_item to read an integer and an item respectively. We specify the input by
Read_Stack: input
Another possibility is to read values until a sentinel is read. The specification in this case is

Specification of Output in Stacks
The output specification uses sequencing. We print the elements of the stack from top to bottom.
output_item: item → output
Print_Stack: Stack → output
Print_Stack(s) = if (not(IsNewStack(s))) then {output_item(Top(s)); Print_Stack(Pop(s));}
Print_Stack can also be specified by pattern matching on the constructors. In the empty case we use the return operation.
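The pattern-matching form of Print_Stack mentioned above might look as follows in Haskell. This is a sketch under assumptions: the primitive action output_item is passed as a parameter (outputItem), and the function is written for any monad m, so that m = IO gives real output.

```haskell
data Stack item = NewStack | Push (Stack item) item

-- Print_Stack by pattern matching on the constructors.
printStack :: Monad m => (item -> m ()) -> Stack item -> m ()
printStack _          NewStack   = return ()   -- empty case: return
printStack outputItem (Push s i) = do
  outputItem i                                 -- output_item(Top(s))
  printStack outputItem s                      -- Print_Stack(Pop(s))
```

For instance, printStack print (Push (Push NewStack 1) 2) prints 2 and then 1, that is, from top to bottom.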

Specification of Queues
A queue is another special kind of sequence, in which items are inserted at one end (the rear) and deleted at the other end (the front). Another name for a queue is FIFO, or first-in-first-out.
Let us begin with the algebraic specification of Queues without input-output.
ADT
Append(q,NewQueue)=q
Append(q,Add_at_back(s,i))=Add_at_back(Append(q,s),i)
We have added a new kind of operation: an extender. In this example we define Append, which returns a Queue composed from two others.
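The two Append equations translate almost literally into a recursive Haskell function. In the sketch below the constructor names follow the specification, while toList is a hypothetical helper added only to inspect the result.

```haskell
-- Constructors: NewQueue and Add_at_back.
data Queue item = NewQueue | AddAtBack (Queue item) item
  deriving (Eq, Show)

-- Extender: the items of the second queue are placed behind those of q.
append :: Queue item -> Queue item -> Queue item
append q NewQueue        = q                          -- Append(q,NewQueue)=q
append q (AddAtBack s i) = AddAtBack (append q s) i   -- second equation

-- Hypothetical helper: items from front to rear.
toList :: Queue item -> [item]
toList NewQueue        = []
toList (AddAtBack q i) = toList q ++ [i]
```

One property suggested by the equations is that toList (append q s) equals toList q ++ toList s.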

Specification of Input in Queues
We consider the same two cases as for Stacks, i.e. reading a specified number of elements and reading until a sentinel is input.
Consider first the case in which we read a predefined number of elements. We use the functions input_int and input_item as before. We specify the input by
Read_Queue: input
The specification of the case in which we read values until a sentinel is:

Specification of Output in Queues
For the output of a queue we print the elements from left to right using sequencing.
output_item: item → output
Print_Queue: Queue → output
Print_Queue(s) = if (not(IsNewQueue(s))) then let i=Front(s) in {output_item(i); Print_Queue(Delete(s));}

Specification of Input in Sets
Consider first the case in which we read a predefined number of elements. We use the functions input_int and input_item as before. We specify the input by
Read_Set_Value(s,0)=s
Read_Set_Value(s,x+1) = let j=input_item() in if Belongs(s,j) then Read_Set_Value(s,x+1) else Read_Set_Value(Add(s,j),x)
We take into account the case in which an input element already belongs to the set. In this case the element is not added again and the number of elements still to be read is not decreased.
The specification of the case in which we read values until a sentinel is:
Read_Set: input → Set
Read_Set() = Read_Set_Value(NewSet)
Read_Set_Value: Set → Set
Read_Set_Value(s) = let i=input_item() in if (i==EOF) then s else if Belongs(s,i) then Read_Set_Value(s) else Read_Set_Value(Add(s,i))
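Both input specifications for Sets can be mirrored in Haskell, as in the following sketch. Several choices here are assumptions: the primitive action input_item is passed as a parameter, the functions work in any monad m (with m = IO for real input), the sentinel EOF is modelled by Nothing, and the names readSetValueN, readSetValue and readSet are hypothetical.

```haskell
data Set item = NewSet | Add (Set item) item
  deriving (Eq, Show)

belongs :: Eq item => Set item -> item -> Bool
belongs NewSet    _ = False
belongs (Add s i) j = i == j || belongs s j

-- Counted input: a repeated item is neither added nor counted.
readSetValueN :: (Monad m, Eq item) => m item -> Set item -> Int -> m (Set item)
readSetValueN inputItem s x
  | x <= 0    = return s
  | otherwise = do
      j <- inputItem                        -- let j = input_item()
      if belongs s j
        then readSetValueN inputItem s x
        else readSetValueN inputItem (Add s j) (x - 1)

-- Sentinel input: stop when the action yields Nothing (i == EOF).
readSetValue :: (Monad m, Eq item) => m (Maybe item) -> Set item -> m (Set item)
readSetValue inputItem s = do
  r <- inputItem
  case r of
    Nothing -> return s
    Just i
      | belongs s i -> readSetValue inputItem s
      | otherwise   -> readSetValue inputItem (Add s i)

readSet :: (Monad m, Eq item) => m (Maybe item) -> m (Set item)
readSet inputItem = readSetValue inputItem NewSet
```

With m = IO, an expression such as readSetValueN readLn NewSet 3 :: IO (Set Int) keeps reading lines until three distinct integers have been entered.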

Specification of Dictionaries
This family defines collections whose elements have an attribute, a key, that identifies them. We will study the specification of input-output for the following dictionaries: Binary Search Trees (BST) and Closed Hash tables.

Specification of Binary Search Trees
A binary search tree is a dictionary whose elements are ordered by some linear order. It is a binary tree in which all the elements in the left subtree of a node are smaller than the element at the node, and all the elements in the right subtree of a node are greater than the element at the node. We call this the search property.
Let us begin with the algebraic specification of Binary Search Trees without input-output. We have an extender AddElem that adds an element to a binary search tree only if the element does not already belong to the tree; otherwise the element is not added. The addition of elements preserves the search property. BSTs are constructed by applying NewBST, which gives an empty BST, and Add, which given two BSTs and an element returns a BST.

Specification of Input in BST
Consider first the case in which we read a predefined number of elements. We specify the input by
Read_BST_Value(s,0,0) = s
Read_BST_Value(s,0,k) = Read_BST_Value(s,k,0)
Read_BST_Value(s,x+1,k) = let i=input_item() in {if (Member(s,Key i)) then Read_BST_Value(s,x+1,k+1); else Read_BST_Value(AddElem(s,i),x,k);}
Parameter k accumulates the number of repeated elements in the input, so that this number of elements is read again afterwards. We repeat this process until there are no more repeated elements at the input. We use sequencing to read x elements as part of the input of the x+1 elements.
The specification of the case in which we read values until a sentinel is:
Read_BST: input → BST
Read_BST() = Read_BST_Value(NewBST)
Read_BST_Value: BST → BST
Read_BST_Value(s) = let i=input_item() in if (Key i==EOF) then s else if (Member(s,Key i)) then Read_BST_Value(s) else Read_BST_Value(AddElem(s,i))
In this case, if we read an element that is already in the tree, it is not added again.

Specification of Output in BST
For the output of a BST we print the elements by an in-order traversal.
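An in-order output can be sketched in Haskell as follows. The sketch makes simplifying assumptions: each element acts as its own key (so the ordering is the Ord instance of item), the primitive action output_item is passed as a parameter, and the Haskell names are hypothetical.

```haskell
-- Constructors: NewBST, and Add taking two subtrees and an element.
data BST item = NewBST | Add (BST item) item (BST item)

-- Extender: insertion that keeps the search property. An element
-- already in the tree is not added again.
addElem :: Ord item => BST item -> item -> BST item
addElem NewBST i = Add NewBST i NewBST
addElem t@(Add l x r) i
  | i < x     = Add (addElem l i) x r
  | i > x     = Add l x (addElem r i)
  | otherwise = t

-- In-order output: left subtree, then the node, then the right subtree.
printBST :: Monad m => (item -> m ()) -> BST item -> m ()
printBST _          NewBST      = return ()
printBST outputItem (Add l x r) = do
  printBST outputItem l
  outputItem x
  printBST outputItem r
```

Because of the search property, this traversal emits the elements in increasing order.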

Specification of Closed Hash Tables
When an application needs to store information, we convert the key into an index that indicates the position at which the information is stored. If we want to add an element and the position is occupied, we search sequentially for the first free position. We search for an element in the same way when we want to know whether it is in the table, to delete it or to modify it.
Let us begin with the algebraic specification of Closed Hash tables.
The operations Select, FHash, SetFree and Key depend on the representation and we do not give definitions for them.
ADT
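The sequential search for a free position described above can be made concrete with one possible representation. Everything in the following sketch is an assumption made for illustration: the table is a fixed-length list of slots, keys are Int values, the hash function is the key modulo the table size, and deletion is left out.

```haskell
-- A table is a fixed-length list of slots; Nothing marks a free slot.
type Table item = [Maybe (Int, item)]

newTable :: Int -> Table item
newTable size = replicate size Nothing

-- Hypothetical hash function: key modulo table size.
fHash :: Int -> Int -> Int
fHash size k = k `mod` size

-- Positions to try: the home position first, then the following ones,
-- wrapping around at the end of the table.
probes :: Int -> Int -> [Int]
probes size k = [ (fHash size k + d) `mod` size | d <- [0 .. size - 1] ]

-- Search along the probe sequence. Without deletions, reaching a free
-- slot means that the key is not in the table.
element :: Table item -> Int -> Maybe item
element t k = go (probes (length t) k)
  where
    go []       = Nothing
    go (p : ps) = case t !! p of
      Nothing -> Nothing
      Just (k', v)
        | k' == k   -> Just v
        | otherwise -> go ps

-- Store the item at the first free position of the probe sequence.
-- A key already present, or a full table, leaves the table unchanged.
add :: Table item -> (Int, item) -> Table item
add t (k, v) = case element t k of
  Just _  -> t
  Nothing -> case [ p | p <- probes (length t) k, isFree (t !! p) ] of
    []      -> t
    (p : _) -> take p t ++ [Just (k, v)] ++ drop (p + 1) t
  where
    isFree = maybe True (const False)
```

Lists keep the sketch short; an array would be the natural choice when constant-time access to a position matters.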

Specification of Input in Closed Hash Tables
Consider first the case in which we read a predefined number of elements. We use the functions input_int and input_item as before. We will specify three ways of adding elements: as in the case of Sets, as in the case of BSTs, and until a sentinel.
Hash
Read_Hash_Value(s) = let i=input_item() in if (i==EOF) then s else if (Member(s,Key(i))) then Read_Hash_Value(s) else Read_Hash_Value(Add(s,i))

Specification of Output in Hash Tables
The operation Set_of_keys below constructs from a Hash
Output_using_key: Set × Hash → output
Output_using_key(s,h) = if (not(IsNewSet(s))) then let s=Add(s',k) in {output_item(Element(h,k)); Output_using_key(s',h);}
Output_using_key prints the elements whose key is in the set.
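Output_using_key can be sketched in Haskell by pattern matching on the set of keys. Since the representation of the table is left open, the sketch takes the lookup operation Element as a parameter (element) together with the primitive action output_item (outputItem); both parameters and the Haskell names are assumptions of this illustration.

```haskell
data Set key = NewSet | Add (Set key) key

-- Output the element stored under each key of the set, in the order
-- in which the keys are taken apart: let s = Add(s',k).
outputUsingKey
  :: Monad m
  => (item -> m ())          -- output_item
  -> (hash -> key -> item)   -- Element
  -> Set key
  -> hash
  -> m ()
outputUsingKey _          _       NewSet     _ = return ()
outputUsingKey outputItem element (Add s' k) h = do
  outputItem (element h k)
  outputUsingKey outputItem element s' h
```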

Conclusions
Algebraic specification supports input-output operations in a functional framework. We model input-output in the same way as the other operations usually considered in the algebraic specification of ADTs. The language considered is sufficiently expressive to model all these operations. The technique is illustrated by a variety of examples: we started with sequences and continued with sets and dictionaries. The specifications presented in this paper can be used as specifications of methods of ADT definitions in object-oriented programming.

Further Work
It remains to study whether the algebraic specification of input-output is complete [11][12].