Some of the datasets used in the paper "PP-attachment disambiguation using large context" by Marian Olteanu and Dan Moldovan.
The content of the datasets is copyrighted, but you have the right to use and modify them for any purpose, commercial or non-commercial.
We make no warranties - explicit or implied - regarding the content of the files or any issue related to them, including but not limited to quality and suitability. Use at your own risk.
Please contact the authors if you want to distribute the datasets.
The following compressed files contain comma-separated files (.names, .data, .test) and sparse SVM numeric files for the following models described in the paper:
For the TB2 datasets, this file also identifies the entities on which the features were based: the parse tree identifier in the Treebank; the identifiers of v, n1, p and n2 in the parse tree; and the identifiers of the verb phrase, np1 and the prepositional phrase.
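A minimal Python sketch of how the two kinds of files mentioned above might be read. It rests on assumptions not stated in this README: that each row of a .data or .test file ends with the class label (possibly followed by a period, as in C4.5-style files), and that the sparse SVM files use the common "label index:value ..." layout. The function names and the sample values in the usage note are hypothetical; check them against the actual files before relying on this.

```python
import csv
import io


def read_csv_examples(text):
    """Parse comma-separated rows into (features, label) pairs.

    Assumption: the last field of each row is the class label.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not fields:
            continue  # skip blank lines
        fields = [f.strip() for f in fields]
        # Assumption: a trailing period on the label is a row terminator.
        fields[-1] = fields[-1].rstrip(".")
        rows.append((fields[:-1], fields[-1]))
    return rows


def parse_sparse_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value}).

    Assumption: '#' starts a trailing comment. Returns None for empty lines.
    """
    parts = line.split("#", 1)[0].split()
    if not parts:
        return None
    features = {}
    for token in parts[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return parts[0], features
```

For example, read_csv_examples("join, board, as, director, V\n") yields one pair whose features are the four words and whose label is "V", and parse_sparse_line("1 3:1 17:0.5") yields the label "1" with features 3 and 17. Labels are kept as strings so that the sketch does not assume a particular label encoding.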
If you have any questions, please contact Marian Olteanu.