Preliminary release of a C/C++ parser for the CSP XML format version 2.0

##########################################################################
#WARNING: the current version of the solver is still in the early
#stages of its development. It is currently incomplete, untested and
#therefore necessarily buggy.
##########################################################################

The files in this archive are a first draft of a C++ parser for the
proposed XML format. A C interface to the C++ parser is also provided.

This parser uses the xerces-c library which must be installed on your
system

See http://xml.apache.org/xerces-c/

The proposed parser reads the XML file in SAX mode. That means that
each time an XML tag is read, the xerces-c library calls a function of
the parser. In turn, the parser will decode the XML information and
call a function of the solver to transmit the informations that were
read.

The advantage of the SAX mode is that it doesn't use much memory. Its
drawback is that it forces to read the information in the order of
the XML file. In contrast, the DOM model would allow reading the
informations in any order but would require to store the whole XML
file in memory before the solver can grab any information.


Here's how the parser interfaces with the solver. The author of a
solver writes an interface (a set of functions) which are called by
the parser when it decodes a piece of information from the XML
file. These interface functions are in charge of populating the data
structures used by the solver. They don't have to cope with the XML
details (only the parser does). This is in fact an instance of the
Builder Design Pattern
(http://www.developer.com/design/article.php/1502691)

The programmer simply calls the parser with the file to read. The
parser calls the interface functions that create the solver data
structures. Once the parser is done, the solver can start its quest of
a solution.

Let's take an example to give a clear idea of how it can work:

1) the solver calls the parser

2) the parser reads the XML file, chunk by chunk

3) when the parser reads the <instance> tag, it calls the solver
function beginInstance() with the name of the instance

4) when the parser reads the <domains> tag, it calls
beginDomainsSection() and announces the number of domains definition
to come

5) when the parser reads a <domain> tag, it calls
beginDomain(Name,nbValue) to indicate that the next calls will concern
domain Name and announces that it contains nbValue. The solver must
create a new domain named Name.

6) for each value v in this domain, it calls addDomainValue(v). The
solver is in charge of recording this value in its data structure for
the last created domain.

7) when all values in the domain have been transmistted to the solver,
the parser calls endDomain()

8) when all domains are read, the parser calls endDomainsSection()

and so on...

One important point is the way the parser communicates an expression
(in the definition of a predicate) to the solver.  As different
solvers may need different representations of the expressions, the
parser is in charge of reading the expression and translating it to a
number of different formats.

This means that the solver may ask to receive a string that represents
a C expression, or a string that represent an expression in prefix
notation, or postfix notation. The solver may even ask to receive the
expression as an abstract syntax tree. 

In short, the parser decodes the XML file, create an internal
representation of the expression as an asbtract syntax tree and
provide the solver with the most convenient representation.

Whenever an alphanumeric identifier is read (name of a domain,
variable, relation,...), a corresponding numerical identifier is
assigned by the parser. These numerical IDs start from 0 and are
assigned separately for domains, variables, relations.... Whenever the
parser communicates an alphanumeric identifier to the solver, it also
provides the numerical identifier.

In the example callbacks, this is displayed as VARNAME/NUMERICALID or
PARAMETERNAME/$POSITION


###
# Makefile
###

type make to build and execute the C++ and C examples

###
# A program to verify the solution given by a solver
###

CSPverify.cc 
	is the verifier program

CSPVerifierCallback.h
	is the callback used by the verifier

The basic use of the verifier program is to pipe the solution given by
the solver to the verifier program with a single command line
parameter which is the name of the instance file. Example:

cat $HOME/solution | CSPverifier $HOME/instance.xml

The verifier will output OK if the solution satisfies all
constraints. It will print ERROR with the first unsatisfied constraint
otherwise.
 
###
# C++ interface and examples
###

CSPParserCallback.h
  is the declaration of the functions that the
  parser will call to communicate informations to the solver

SampleCallback.h
  is an example of callback that simply prints the informations that are read

main.cc
  is an example main program

###
# C interface and examples
###

C_CSPParserCallback.h 
  is the declaration of the functions that the
  parser will call to communicate informations to the solver

  see CSPParserCallback.h for the meaning of each callback

C_XMLParser.cc 
  is the parser module. It must be compiled by a C++ compiler

C_SampleCallback.c
  is an example of callback that simply prints the functions that are
  called with their parameters

C_main.c
  is an example of a main program


###
# Comments for MAC users:
###

  From Chris Jefferson <chris.jefferson@comlab.ox.ac.uk>

  The way I built it was:

  Install fink and get it working ( http://fink.sourceforge.net/ )
  type "fink install fink install xerces-c xerces-c-dev" to get packages.
  Add "CFLAGS= -I/sw/include -L/sw/lib" to the Makefile.