7.4 Routes without Roots
With trees, we could find any node simply by enumerating the paths
from the root. We can now do the same for graphs in a way which makes
us immune from loops, but we still must assume that there is at least
one root.
With general graphs we are not so fortunate. There may
be many pairs of nodes in a graph between which there is no
path.4
So the various implementations of the path predicate are
not yet as much help to us as they could be.
The path problem introduced us to ways of dealing with
loops, but we still have no general-purpose way to be sure that we
have visited every node in a graph.
Suppose that we are given a graph G. We can nondeterministically
select a node N from G, and begin a traversal of G from
N. Unless we are very lucky, both in our choice of G (it is
connected and rooted) and of N (it is one of G's roots), a
traversal of G starting from N will not reach all of
G's
nodes. However, we can iterate the procedure on the remaining
unvisited nodes, biting off a chunk of G at each iteration,
until we
have finally visited all of G's nodes.
The definition of df_traversal/2
on page X is
almost all we need. In particular, the subroutine dft/4
is
already set up to return, in its third argument, whatever remains of
the graph when it is done. In the previous definition we simply threw
away whatever remained of the graph after the top level call to
dft/4
from df_traversal/2
. So the only modification we
need is to replace the old `one shot' definition of
df_traversal/2
with a new recursive one, which calls
itself repeatedly until the remainder of the graph is empty.
Also, note that instead of starting with the first node in the graph,
we call select/3 to choose a starting point
nondeterministically. Since we cannot guarantee that our graph has
roots, we cannot guarantee that its first element is such a root. So
as far as we know one starting point is as good as another. (Actually,
some starting points are much better than others, but there is no easy
way to know in advance which those are...) This means that this
predicate has many more solutions than the rooted version. We can
restrict it to return only one solution by always choosing the first
node in the graph as our starting place. Finally, the call to
append/3 in the recursive clause of df_traversal/2
inserts an `*' to mark the points at which a new
``pseudo-root'' was selected. This divides the output into what we
might think of as ``depth-first-reachable components'', approximately.
df_traversal( [], [] ).
df_traversal( G0, [N|Route] ) :-
select( N/Ns, G0, G1 ), % select N as a starting point
% G0 = [N/Ns|G1], <- fewer solutions
dft( Ns, G1, G, Route1 ), % search from N's neighbors
df_traversal( G, Route2 ),
append(Route1,[*|Route2],Route).
dft( [], G, G, [] ).
dft( [N|Rest], G0, G, [N|Route] ) :-
select( N/Ns, G0, G1 ), % N is unvisited; mark it visited
dft( Ns, G1, G2, RouteL ), % go down from N
dft( Rest, G2, G, RouteR ), % now go across
append(RouteL,RouteR,Route).
dft( [N|Rest], G0, G, Route ) :-
nonmember_( N, G0 ), % N was visited
dft( Rest, G0, G, Route ). % so ignore it and keep going
As in the various path predicates, we use the graph-reduction
technique to mark nodes as visited. So we use select/3 to access the
graph data structure, which simultaneously checks for membership in
the graph and deletes the requested element if it is a member (and of
course failing if the requested item is not an element of the
graph).
This predicate also seves as another illustration of the
accumulator pair technique. The careful reader will have noted
that we always speak of ``the'' data structure representing the graph,
when in fact we use a multitude of different variables to represent
it. For example, in clause 2 of dft/4, ``the'' graph
appears as the value of G0, G1, G2 and
G,
each of which
are variables with possibly quite distinct values. For now it is
probably best to understand accumulator pairs in a procedural
manner. From this point of view one can see these chains of
graph-valued variables as tracking the evolution of the graph over
time. This ``evolutionary'' point of view is considered procedural
rather than logical because the introduction of a notion of time
brings with it a notion of order: ``first it looks like this, then we
change it and it looks like that.'' But the logical interpretation of
a clause body is a conjunction of sub-goals, and conjunction is
commutative, and so order-free. In the next chapter we will consider
the logical semantics of accumulator pairs, and discover thereby some
of the most powerful techniques of logic programming. But for now, one
should just try to follow how each successive ``G-variable'' marks a
phase in the evolution (here ``evolution'' = ``reduction'') of the
graph.
The general structure of df_trav/2
is essentially the same as
any other simple recursive predicate. In particular, since graphs are
implemented as lists of a certain sort and the depth-first search task
is essentially a universal task requiring that the entire graph be
traversed, it is completely unsurprising that the base case is the
case where the graph is empty. The recursion here is not, however,
exactly the familiar recursion-on-a-structure. In particular, each
step of the recursion does reduce the structure, but the nature of
this reduction bears no relation to any set of generating functions
which might be used to build up the set of graphs out of primitive
elements. Abstractly, we are treating graphs as essentially being
built up out of ``depth-first-reachable'' components, sets of nodes
which are reachable by a depth-first traversal. Each call to
dft/3 removes one of these components (a set of nodes,
together with all of the arcs leading out of those nodes), so each
recursive call to df_trav/2
does reduce the size of the graph
structure, so the recursion is ``well founded'' and the program will
terminate.
However, once again,
the recursion is on an abstract structure, an interpretation of the
graph as being made up of ``depth-first-reachable components'' which
is not directly related to any feature of the actual Prolog data
structure we use to implement and describe the graph.
7.4.1 Other search orders
Consider again the code for dft/3
, in particular clause 2.
dft( [], G, G ).
dft( [N|Rest], G0, G ) :-
select( N/Ns, G0, G1 ), % N is unvisited; mark it visited
dft( Ns, G1, G2 ), % go down from N
dft( Rest, G2, G ). % now go across
dft( [N|Rest], G0, G, Ts ) :-
nonmember_( N, G0 ), % N was visited
dft( Rest, G0, G, Ts ). % so ignore it and keep going
Clause 2 is the clause that does most of the actual work. If we were
to apply this general algorithm to a particular problem (as we will in
the next subsection) this is the clause where we would do the actual
computing that adds
information to the output variable(s), since this is the clause in which we
have just accessed a brand new unvisited node and its neighbors.
This clause is doubly recursive, calling itself twice on different
inputs. This is to be expected if we consider graphs to be
generalizations of trees, since the main computational feature of
trees is the double recursion (``down'' and ``sideways'') which they
require. The only wrinkle is that the double recursion is a call to
the very same predicate, but this is not a computationally very
significant difference. But now we ask ourselves: does it make a
difference in which order we call these two recursions? Well, yes and
no. We still visit all the reachable nodes, but we visit them in a
very different order.
Now consider the following slight modification of
dft/3
,
clause 2:
dft_b( [N|Rest], G0, G ) :-
select( N/Ns, G0, G1 ), % N is unvisited; mark it visited
append( Ns, Rest, Agenda ),
dft_b( Agenda, G1, G ).
This is equivalent to the old version. But in this version, the choice
of a breadth-first vs. depth-first search order can be made with an
almost trivial modification, namely we replace
append( Ns, Rest, Agenda )
with
append( Rest, Ns, Agenda )
We think of the list Agenda as ordering the tasks which our
program has to perform. If we treat Agenda as a
stack, that
is, always adding things at the front, and removing them from the
front, then we get the depth-first version illustrated here. On the
other hand, computer science offers us an alternative to stacks called
queues, which are sequences where we remove things from the
front, as with stacks, but we add things at the back. This
means that, as a new task is added to the back of the queue, it will
not be selected for processing until all the tasks that were already
in the queue when it was added have been selected for processing. So,
the first version of
append/3---append( Ns, Rest, Agenda)
---adds
new material (Ns) to the
front, treating Agenda
as a stack, while the second
version---append( Rest, Ns, Agenda )
---adds the new material to
the back, treating Agenda as a queue. We could in fact
substitute for the terms ``depth-first'' and ``breadth-first'' the
terms ``stack-based'' and ``queue-based''.
Note that there is nothing sacred about stacks and queues.
There are many more ways than these to order a collection of tasks.
In
particular, both strategies always select for processing the first
(leftmost) item. This is reflected here in that we unify Agenda
with [N|Rest]
, selecting its head, N, for processing. Consider
yet another version of this basic logic:
dft_c( OldAgenda, G0, G ) :-
choose( N, OldAgenda, Rest ),
select( N/Ns, G0, G1 ),
append( Ns, Rest, NewAgenda ),
dft_c( NewAgenda, G1, G ).
Here we have replaced the extraction of N from the agenda by means
of unification with [N|Rest]
with an explicit call to a
yet-to-be-defined
predicate choose/3. Our current version of graph traversal would
simply implement choose/3 as follows.
choose( N, [N|Rest], Rest ).
However, we could make our choice of N arbitrarily sophisticated, by
modifying the definition of choose/3.
choose( N, List, Rest ) :-
select( N, List, Rest ),
okay( N ).
The above clause implements a version of choose/3 which
nondeterministically selects an N which is okay, whatever we
decide that that should mean.
The last version of depth first traversal
which we will consider here returns a depth-first search forest,
instead of a list. This
gives us a more informative picture of how the graph was traversed. In
particular, it tells us how we got to the nodes we visited. The idea
is this. Depth-first search does not visit any node twice. Therefore,
for any node, we can provide a unique sequence of arcs which were used
to reach it. This must be true because if the sequence of arcs were
not unique, then there would have to be some node which was accessed
twice, which we have explicitly avoided doing; so we derive a
contradiction. Therefore, for each starting place that we select in
df_trav/2
, there is a unique path to every node which was visited
on the corresponding call to dft/3. This is sufficient to
show that the set of nodes which are ``reachable'' from
some node N form a tree: the set is rooted (in N) and every member
of the set has a unique path from the root. In addition we have that
the set forms an ordered tree, since for every node in the tree its
immediate subtrees are visited in some definite order. So each
starting node defines a tree, so a sequence of starting nodes defines a sequence
of trees. This sequence is a depth-first search forest. It is
unique for every choice of sequence of starting nodes, but note that
that choice is not unique for a given graph, since it is determined by calls to the
nondeterministic predicate select/3. So any given graph can have
many different search forests.5
df_trees( [], [] ).
df_trees( G0, [T1|Ts] ) :-
select( N/Ns, G0, G1 ),
T1 = N/Ts1,
df_trees_aux( Ns, G1, G, Ts1 ),
df_trees( G, Ts ).
df_trees_aux( [], G, G, [] ).
df_trees_aux( [N|Rest], G0, G, [T|RestTs] ) :-
select( N/Ns, G0, G1 ),
T = N/SubTs,
df_trees_aux( Ns, G1, G2, SubTs ),
df_trees_aux( Rest, G2, G, RestTs ).
df_trees_aux( [N|Rest], G0, G, Ts ) :-
nonmember_( N, G0 ),
df_trees_aux( Rest, G0, G, Ts ).
/*