M. Olteanu and D. Moldovan. 2005. PP-attachment disambiguation using large context. In Proceedings of HLT/EMNLP 2005, Vancouver, Canada.
[pdf]
2004
M. Olteanu. 2004. Prepositional Phrase Attachment Ambiguity Resolution Through A Rich Syntactic, Lexical And Semantic Set Of Features Applied In Support Vector Machines Learner. Master's Thesis.
[pdf]
Abstract:
Prepositional Phrase Attachment is a common source of ambiguity in natural language. The
best results reported in the literature to solve PP-attachment ambiguity are based on Machine
Learning reaching accuracy rates of up to 88.1%. These systems limit the context captured by
the features to no more than the 4 major elements that define a simple ambiguity case (verb
versus noun attachment): verb, noun before PP, preposition, noun in PP. This thesis proposes
to solve PP-attachment ambiguity with a Machine Learning approach using Support Vector
Machines as a learner; the feature set contains complex features extracted from a candidate
syntax tree, which can be generated by automatic parsing; some of these features were
proven efficient for semantic information labeling. The feature set also includes features
based on unsupervised information obtained from a very large corpus (World Wide Web);
features containing manually annotated semantic information about the verb and about the
objects of the verb have also been used. The accuracy of the developed system on a Penn
Treebank-II dataset is 93.62%; the accuracy of the system on the FrameNet dataset is 91.79%
when no manually-annotated semantic information is provided and 92.85% when semantic
information is provided.
D. Moldovan, R. Girju, M. Olteanu, and O. Fortu. 2004. SVM classification of FrameNet semantic roles. In Proceedings of Senseval-3 - Semantic Roles Task. Barcelona, Spain.
[pdf]
R. Girju, A. Giuglea, M. Olteanu, O. Fortu, O. Bolohan, and D. Moldovan. 2004. Support Vector Machines Applied to the Classification of Semantic Relations in Nominalized Noun Phrases. In Proceedings of the HLT/NAACL Workshop on Computational Lexical Semantics. Boston, MA.
[pdf]
Software and data
Phramer - An Open-Source Statistical Phrase-Based MT Decoder
Datasets for "PP-attachment disambiguation using large context" by Marian Olteanu and Dan Moldovan (2005).