Cass NP recogniser
An NP recogniser collects the words that constitute noun phrases. For example, in the Danish sentence "Den sorte kats mindste killing har en meget tyk mave." ("The black cat's smallest kitten has a very fat belly."), two noun phrases are marked: NP[Den sorte kats mindste killing] har NP[en meget tyk mave]
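To make the bracketing concrete, here is a minimal sketch of finite-state NP chunking written as a regular expression over part-of-speech tags. It is not the Cass grammar format or the CST grammar itself: the tag names (DET, ADJ, ADV, NGEN, N, V, PUNCT), their one-letter codes, the pattern, and the hand-assigned tags for the example sentence are all assumptions made for illustration.

```python
import re

# Hypothetical one-letter codes for a hypothetical coarse tagset
# (not the Parole tagset): one character per token, so a match
# position in the code string equals a token index.
TAG_CODES = {
    "DET": "D", "ADJ": "J", "ADV": "R",
    "NGEN": "G",  # noun in the genitive, e.g. "kats"
    "N": "N", "V": "V", "PUNCT": "X",
}

# Assumed pattern for a simple NP that ends at its head noun:
# optional determiner, adjectives (each optionally preceded by adverbs),
# any number of genitive nouns with their own adjectives, then the head.
NP_PATTERN = re.compile(r"D?(?:R*J)*(?:G(?:R*J)*)*N")


def chunk_nps(tagged):
    """Return the tokens of `tagged` with each recognised NP
    collapsed into a single 'NP[...]' item."""
    codes = "".join(TAG_CODES[tag] for _, tag in tagged)
    words = [word for word, _ in tagged]
    out, pos = [], 0
    for match in NP_PATTERN.finditer(codes):
        out.extend(words[pos:match.start()])  # tokens outside any NP
        out.append("NP[" + " ".join(words[match.start():match.end()]) + "]")
        pos = match.end()
    out.extend(words[pos:])
    return out


# Tags assigned by hand for this illustration.
SENTENCE = [
    ("Den", "DET"), ("sorte", "ADJ"), ("kats", "NGEN"),
    ("mindste", "ADJ"), ("killing", "N"), ("har", "V"),
    ("en", "DET"), ("meget", "ADV"), ("tyk", "ADJ"),
    ("mave", "N"), (".", "PUNCT"),
]

print(" ".join(chunk_nps(SENTENCE)))
# NP[Den sorte kats mindste killing] har NP[en meget tyk mave] .
```

Because the pattern stops at the head noun, anything that follows the head (such as a relative clause) would be left outside the brackets in this sketch.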
Noun phrases (NPs) in a text typically function as subject and object, so identifying them together with the verbs yields a rough analysis of the sentence. NP recognition can also be used in, for example, information retrieval, where the relation between compound words and their synonymous phrases can be especially relevant, for example byrådsmedlem ("city council member") vs. medlem af byrådet ("member of the city council").

CST's NP recogniser is implemented in Cass, a finite-state chunk parser. The system is basically language independent, but the NP grammar is modelled on NPs found in the Danish Parole corpus. The grammar identifies simple NPs, ranging from the start of the NP to its kernel (the head noun). Relative clauses and coordinations of NPs are not found, but postposed proper names and the first prepositional phrase after the kernel are recognised on an experimental basis.

More information
Report about the NP recogniser used in information retrieval (Danish)
User guide to the Danish Parole corpus (Danish)
Read more about content-based information retrieval in Ontoquery and about the relation between NPs and compound words in the VID project (Danish).

Contact: Dorte Haltrup Hansen
Emil Holms Kanal 2, building 22, 3, DK-2300 Copenhagen S