Frank Morawietz
Seminar für Sprachwissenschaft
Theoretische Computerlinguistik
Neuphilologische Fakultät
Universität Tübingen

Uwe Mönnich, Frank Morawietz and Stephan Kepser

A Regular Query for Context-Sensitive Relations

9pp., PS (336kb), PDF (127kb); Slides of the talk given by St. Kepser in Philadelphia PDF (334kb).


One of the fundamental problems when defining a query language for databases consists in finding a balance between the desiderata of a sufficiently large expressive power on the one hand and an adequate computability of queries on the other. This problem occurs of course also with linguistic treebanks, the prototype of non-relational semistructured databases. There are many linguistic phenomena which can be adequately annotated by using trees, and for which there exists a powerful yet decidable query language, namely monadic second order logic (MSO). But on the other hand there exist linguistic phenomena such as cross-serial dependencies in Swiss German which cannot be described with context-free means and for which therefore MSO is not expressive enough as a query language. Instead of going over to a more expressive query language and losing decidability on the way we propose to employ a two-level approach, which has proven successful in handling mildly context-sensitive phenomena before.

The two-level approach consists of a lifting step in which the (grammar of) the treebank and the MSO query is lifted to the free Lawvere-algebra where a coding of mildly context-sensitive relations within the realm of MSO logic is possible. This step allows to filter out all undesirable query results and retrieve only the relevant ones. In the second step, the returned answer trees are retranslated into the original trees of the treebank. By using techniques from automata theory in both steps we can ensure that the query language remains decidable.

In case of problems or for comments, please contact:
Last updated: 11-Jan-2002