HaLeX: A Haskell Library to Model, Manipulate and Animate Regular Languages
Joćo Saraiva
Departamento de Informįtica
Universidade do Minho
Portugal
What is HaLeX?
The Library
Drawing Finite Automata
Reactive Finite Automata
HaLeX in Education
Where to get the software
Documentation
Publications

The Haskell library HaLeX enables us to model, manipulate and animate regular languages. This library was developed in the context of a programming methodology course for undergraduate students, and as a consequence, it was defined mainly for educational purposes. Indeed, it provides a clear, efficient and concise way to define, to understand and to manipulate regular languages in Haskell. Furthermore, the construction of the complete library has been proposed as assignment projects to the students following the course. HaLeX is now being used to support this course.

1- What is HaLeX?

The HaLeX library introduces a number of Haskell datatypes and the respective functions that manipulate them, providing the following facilities:
  • The definition of deterministic finite automata, non-deterministic finite automata, and regular expressions directly and straightforwardly in Haskell.
  • The definition of acceptance functions for all those models.
  • The transformation from regular expressions into non-deterministic finite automata (NDFA) and from NDFA into deterministic finite automata (DFA).
  • The minimization of the number of states of deterministic finite automata.
  • The equivalence of regular expressions and finite automata.
  • The graphical representation of finite automata.
  • The definition of Lex-like facilities, for example, from a regular expression it generates an efficient recognizer for the language: a Haskell program based on the equivalent minimized DFA.
  • The definition of reactive finite automata. In order to allow that an automaton reacts while accepting an input sentence, the library defines a monadic finite automaton. The reactions are defined straightforwardly as monadic functions, that are easily embedded in the automaton definition.
  • The simplification of regular expressions (to be included soon: this year students project)
  • The automatic animation of the acceptance function of finite automata.
The HaLeX library is now being used in the HaGLr parser generator . The HaGLr produces top-down and bottom-up parsers that use several parsing techniques, such as LL(1), SLR(1) and the powerful Generalized LR parsing technology. HaLeX is used to construct, manipulate and visualize the underlying LR(0) automata for bottom-up parsers.

2- The Library



2.1- Regular Expressions

Regular expressions are modelled in Haskell by the abstract data type RegExp , parameterized with the type of the symbols. We define one data constructor for each "standard" regular expression operator.

data RegExp sy = Empty
| Epsilon
| Literal sy
| Or (RegExp sy) (RegExp sy)
| Then (RegExp sy) (RegExp sy)
| Star (RegExp sy)
| OneOrMore (RegExp sy)
| Optional (RegExp sy)

and we can write:
a = Literal 'a'
aOrb = ( Literal 'a') ` Or ` ( Literal 'b')



2.2- Deterministic Finite Automata

Formally, a Deterministic Finite Automaton (DFA) is a five-tuple A = (V,Q,S,Z,D) . In order to make our code more readable, we prefer to define a new data type, to express the automata's five-tuple, named Dfa with a single constructor, named Dfa as well, rather than using the Haskell built-in tuples. To make our DFA's more general, we parameterize the data type with the types of states and symbols

data Dfa st sy = Dfa [ sy ] -- Vocabulary
[ st ] -- Finite set of states
st -- The start state
[ st ] -- The set of final states
(st - > sy - > st) -- The transition function

where sets are modelled through the Haskell built-in lists.

Now, We can define DFA easily in Haskell as follows:

ex1 :: Dfa Char Char
ex1 = Dfa [ 'a','b' ]
[ 'A','B','C','D' ]
'A'
[ 'C' ]
delta
where delta 'A' 'a' = 'A'
delta 'A' 'b' = 'B'
delta 'B' 'a' = 'C'
delta _ _ = 'D'



2.3- Non-Deterministic Finite Automata

We define non-deterministic finite automata with epsilon transitions and several initial states.

To epsilon transitions we use the pre-defined datatype Maybe, where the constructor Nothing models epsilon-transitions, while the constructor Just models transitions labeled by its argument symbol.

data Ndfa st sy = Ndfa [ sy ] -- Vocabulary
[ st ] -- Finite set of states
[ st ] -- The set of start states
[ st ] -- The set of final states
(st - > Maybe sy - > [ st ] ) -- The transition function

Next, we show an example of a NDFA.

ex1 :: Ndfa Char Char
ex1 = Ndfa [ 'a','b' ]
[ 'A','B','C' ]
[ 'A' ]
[ 'C' ]
delta
where delta 'A' ( Just 'a') = [ 'A','B' ]
delta 'A' ( Just 'b') = [ 'B' ]
delta 'B' ( Just 'a') = [ 'C' ]
delta 'C' Nothing = [ 'A','B' ]
delta _ _ = [ ]


3- Drawing Finite Automata

Finite automata are more comprehensible in a graphical representation. The following representation is used by HaLeX : states are depicted as the nodes of a graph, final states get a red double circle, and the initial states are explicitly mentioned by using a green circle. The transition function induces arrows connecting the nodes of the graph: for each transition o to d thought symbol s there is an arrow labeled by s from node o to d .

Drawing graphs is a complex task and an intensive area of research. Thus, instead of defining a graph visualization system from scratch, and reinventing the wheel, HaLeX synthesizes a graph representation for an existing high quality graph visualization system: the GraphViz system , developed and available at AT\&T Labs .

GraphViz provides a collection of tools for manipulating graph structures and generating graph layouts. The input is a description of the graph in the Dot language. Next, we present an example of the input Dot text (left) and the outputed graph as displayed by one of GraphViz tools, the dot tool (right).

digraph HaLeX {
rankdir = LR ;
"1" [ shape = circle , color = green ] ;
"2" [ shape = doublecircle , color = red ] ;
"3" [ shape = doublecircle , color = red ] ;
node [ shape = circle , color = black ] ;
"1" - > "2" [ label = "'a'" ] ;
"1" - > "3" [ label = "'b'" ] ;
"2" - > "4" [ label = "'a','b'" ] ;
"3" - > "3" [ label = "'a'" ] ;
"3" - > "4" [ label = "'b'" ] ;
"4" - > "4" [ label = "'a','b'" ] ;
}

The HaLeX library includes a pretty-printing function, named toGraphViz that, given a FA, produces its (textual) graph representation in the Dot language. In order to ``beautify'' the graphical representation, we have included options to ignore the sink states and to fuse transitions with the same origin and destination.

4- Reactive Finite Automata

While accepting a sentence, we may wish the automaton to react. For example, we may wish to compute some trace information (for debugging purposes), or to perform some IO operations, or to compute semantic functions (in our example of real numbers, to convert the sequence of characters into its real value). A reactive finite automaton (RFA) is a machine that reacts while moving from one state to another. To add reactions to finite automata we use monads and we define monadic finite automata. Because we wish to associate different effects to finite automata, we parameterize the type of the DFA with the type of the monad. The reactions are defined in the (monadic) transition function. As a result, those functions not only have to indicate the destination state(s), but also the reactions to be performed in the transitions. Reactive deterministic finite automata are defined as follows:

data Dfa m st sy = Dfa [ sy ] -- Vocabulary
[ st ] -- Finite set of states
st -- The start state
[ st ] -- The set of final states
(st - > sy - > m st) -- The Monadic transition function



4.1- Communication protocol

We want to construct a program to implement the communication protocol between two electronic components. The communication is established by sending an initial sequence of bits 000 . Then, several 3-bit values are sent in order to configure different parameters of the components. Those values are separated by the special sequence of bits 001 . The communication ends after the sequence of bits 111 is sent. Consider also that to configure the components the program has to compute the integer values transmitted.

ptl :: Dfa (State ( [ Char ] , [ Int ] )) Int Char
ptl = Dfa [ '1','0' ] [ 1,2,3,4,5,6,7,8,9,10,11,12,13 ] 1 [ 12 ] delta
where delta 1 '0' = return 2
delta 2 '0' = return 3
delta 3 '0' = return 4
delta 4 s = do { bits s ; return 5 }
delta 5 s = do { bits s ; return 6 }
delta 6 s = do { bits s ; return 7 }
delta 7 '0' = do { values ; return 8 }
delta 7 '1' = do { values ; return 10 }
delta 8 '0' = return 9
delta 9 '1' = return 4
delta 10 '1' = return 11
delta 11 '1' = return 12
delta _ _ = return 13
bits x = modify (\ s - > ((fst s) + + [ x ] ,snd s))
values = modify (\ s - > ("",snd s + + [ bits2int (fst s) ] ))


5- HaLeX in Education

We started developing the HaLeX library in 2001 in the context of a third year course on programming methodology. This course has a working load of 24 hours of theoretical classes and another 24 hours of laboratory classes, running for 12 weeks (ie, a semester). The theoretical classes introduce the basic concepts of regular expressions, finite automata and context-free languages. HaLeX and HaGLr are used to support such classes. In the laboratory (a two hour class per week) the students have to solve exercises using a computer. We have defined eleven exercise sets (one per week), using literate Haskell, that the students have to complete. Each set of exercises defines a module of the HaLeX library. Thus, at the end of the course the students have a complete documentation of all the exercises and topics covered in the course, and, also, of the HaLeX library.

Furthermore, during the semester the students have to complete two assignment projects. The construction of parts of the HaLeX and HaGLr libraries have been proposed as such projects.

6- Where to get the software

The HaLeX system is public domain and it is available as a gzipped tar file at: HaLeX_1.1.tgz
7- Documentation

THe HaLeX API is available here here . This documentation is also being distributed with the tool.
8- Publications

  • Joćo Saraiva, HaLeX: A Haskell Library to Model, Manipulate and Animate Regular Languages , proceedings of the ACM Workshop on Functional and Declarative Programming in Education (FDPE/PLI'02), Pittsburgh, USA, October 2002. ps , abstract , bibentry



Pagina mantida por:

Joćo Saraiva

jas@di.uminho.pt
Page produced by a Tool generated by LRC from the XML document HaLeX.xml

Last Change on Thu Oct 13 14:53:08 2005