de.uni_tuebingen.sfb.lichtenstein.treebanks
Class TreebankConverter

java.lang.Object
  extended by de.uni_tuebingen.sfb.lichtenstein.treebanks.TreebankConverter

public class TreebankConverter
extends Object

Converts trees in NEGRA 3 format from TueBaD into MONA format. TODO: use Tigerxml.

Forces connectedness (no disconnected components) and binary branching. Secondary edges are ignored.
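To illustrate what binary branching means here, the following is a minimal sketch of one possible binarization strategy. It is not taken from the converter's source: the Node class, the binarize method, the "*" helper labels and the right-branching fold are all assumptions made for this example, and the strategy TreebankConverter actually uses is not documented on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BinarizeSketch {
    /** Hypothetical n-ary tree node; not part of the documented API. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = new ArrayList<>(Arrays.asList(children));
        }

        /** Bracketed form, e.g. (S (NP) (VP)); a leaf prints as (label). */
        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node child : children) {
                sb.append(' ').append(child);
            }
            return sb.append(')').toString();
        }
    }

    /** Returns a copy of the tree in which no node has more than two children. */
    static Node binarize(Node node) {
        List<Node> kids = new ArrayList<>();
        for (Node child : node.children) {
            kids.add(binarize(child));
        }
        // Fold the two rightmost children into a helper node labelled "X*"
        // until at most two children remain (an assumed, right-branching scheme).
        while (kids.size() > 2) {
            Node right = kids.remove(kids.size() - 1);
            Node left = kids.remove(kids.size() - 1);
            kids.add(new Node(node.label + "*", left, right));
        }
        return new Node(node.label, kids.toArray(new Node[0]));
    }

    public static void main(String[] args) {
        Node flat = new Node("S", new Node("A"), new Node("B"), new Node("C"), new Node("D"));
        System.out.println(binarize(flat)); // prints (S (A) (S* (B) (S* (C) (D))))
    }
}
```

In this sketch the yield of the tree (A B C D) is unchanged; only helper nodes are added.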


Field Summary
static String NEGRA_EXPORT_FILE_ENDING
          The file ending for the treebank file in NEGRA export format.
static String OBJECT_FILE_ENDING
          The file ending for the file which contains the binary trees in Java serialized form.
 
Constructor Summary
TreebankConverter()
           
 
Method Summary
static void convert(File corpFile, File destDir)
          Convert a corpus file into binary trees.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

NEGRA_EXPORT_FILE_ENDING

public static final String NEGRA_EXPORT_FILE_ENDING
The file ending for the treebank file in NEGRA export format.

See Also:
Constant Field Values

OBJECT_FILE_ENDING

public static final String OBJECT_FILE_ENDING
The file ending for the file which contains the binary trees in Java serialized form.

See Also:
Constant Field Values
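
As a rough illustration of what "Java serialized form" means for the object file, the sketch below writes a small binary tree with ObjectOutputStream and reads it back with ObjectInputStream. The BinTree class, the helper methods and the ".obj" suffix are assumptions for this example only. The tree class the converter really serializes and the actual value of OBJECT_FILE_ENDING are not shown on this page.

```java
import java.io.File;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;

class ObjectFileSketch {
    /** Hypothetical serializable binary tree; a leaf has null children. */
    static final class BinTree implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        final BinTree left;
        final BinTree right;

        BinTree(String label, BinTree left, BinTree right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
    }

    /** Writes the tree to the given file in Java serialized form. */
    static void write(BinTree tree, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file.toPath()))) {
            out.writeObject(tree);
        }
    }

    /** Reads a tree back from a file produced by write. */
    static BinTree read(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file.toPath()))) {
            return (BinTree) in.readObject();
        }
    }

    /** Writes the tree to a temporary file and reads it back; failures are rethrown unchecked. */
    static BinTree roundTrip(BinTree tree) {
        try {
            // ".obj" is a placeholder suffix, not the real OBJECT_FILE_ENDING value.
            File file = File.createTempFile("tree", ".obj");
            file.deleteOnExit();
            write(tree, file);
            return read(file);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        BinTree tree = new BinTree("S", new BinTree("NP", null, null), new BinTree("VP", null, null));
        BinTree copy = roundTrip(tree);
        System.out.println(copy.label + " -> " + copy.left.label + " " + copy.right.label);
    }
}
```

Reading such a file requires the serialized class to be on the classpath, so consumers of the object files would need the same tree classes as the converter.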
Constructor Detail

TreebankConverter

public TreebankConverter()
Method Detail

convert

public static void convert(File corpFile,
                           File destDir)
                    throws IOException,
                           FormatException
Convert a corpus file into binary trees. Every tree in the corpus is converted into a binary tree and written as an object file in the directory where the corpus file is.

Parameters:
corpFile - The corpus file in NEGRA export format.
destDir - The directory where the converted corpus should be saved.
Throws:
IOException - If the given file is corrupted or could not be read, or if the binary object file could not be written.
FormatException - If the format could not be detected, or this is not a NEGRA corpus.
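
A call to convert might be wrapped as in the following sketch, which checks the arguments first and then handles the two documented failure modes. So that the example compiles without the library, it uses a hypothetical CorpusConverter interface with the same parameter list as convert. The interface, the convertSafely helper and the error messages are assumptions for illustration and are not part of this API.

```java
import java.io.File;
import java.io.IOException;

class ConvertUsageSketch {
    /** Hypothetical stand-in with the same parameters as TreebankConverter.convert. */
    interface CorpusConverter {
        void convert(File corpFile, File destDir) throws Exception;
    }

    /** Checks the arguments, runs the conversion, and returns true only on success. */
    static boolean convertSafely(CorpusConverter converter, File corpFile, File destDir) {
        if (!corpFile.isFile() || !destDir.isDirectory()) {
            System.err.println("Missing corpus file or destination directory");
            return false;
        }
        try {
            converter.convert(corpFile, destDir);
            return true;
        } catch (IOException e) {
            // Unreadable or corrupted input, or the object file could not be written.
            System.err.println("I/O problem: " + e.getMessage());
        } catch (Exception e) {
            // With the real class, a FormatException would be expected to land here.
            System.err.println("Conversion failed: " + e.getMessage());
        }
        return false;
    }
}
```

With the library on the classpath, and assuming FormatException is an ordinary checked exception, the real method could be passed as a method reference, for example convertSafely(TreebankConverter::convert, corpFile, destDir).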


© Copyright 2008 Hendrik Maryns   Creative Commons License