CST's online demos
This page links to demo versions of our language technology tools, grammars and
word bases.
Here you can see where you need
language technology applications.
- Tokeniser
-
CST's tokeniser creates segments containing one sentence each and divides these
into tokens - words, numbers and punctuation.
- Name recogniser
-
CST's named entity recogniser demarcates and classifies proper nouns in a text.
- POS-tagger
-
The POS-tagger automatically assigns a word class (noun, verb, etc.) to each word
in a text.
- Lemmatiser
-
CST's lemmatiser reduces each word form in a text to the word's lemma, its
base form. The lemma is the form you would look up in a dictionary.
- NP-recogniser
-
The Cass NP-chunker demarcates simple noun phrases.
- Repetitiveness checker
-
The program uses a statistical method to find repeated word groups that stand out,
e.g. "on the basis of" or "the high contracting parties" in an EU-related text.
- Keywords
-
This program extracts 20 keywords characterising an input text.
- Multi word terms
-
The program finds the most relevant adjective + noun combinations among the words
in the text.
- Test a combination of tools
-
The tools listed above can be combined in different configurations.
- Summarisation tool
-
The summarisation tool (DanSum) can be used for automatic summarisation of Danish
newspaper articles and other text documents.
- Computational Lexicon for Danish
-
User interface to STO.
- Anonymisation
-
IDentification and ANonymisation of NAmes (IDANNA).
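
As a rough illustration of what the tokeniser entry above describes (one segment per sentence, each divided into words, numbers and punctuation), here is a minimal Python sketch. It is not CST's implementation: the function names and regular expressions are assumptions made for this example, and the naive sentence splitting would mishandle abbreviations such as "e.g.".

```python
import re

# Illustrative only: these patterns and names are assumptions for this
# sketch, not the rules used by CST's tokeniser.

# A sentence boundary is assumed to be whitespace that follows ., ! or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

# A token is a number (optionally with , or . inside), a word (optionally
# with internal hyphens or apostrophes), or a single punctuation character.
TOKEN = re.compile(r"\d+(?:[.,]\d+)*|\w+(?:[-']\w+)*|[^\w\s]")


def split_sentences(text):
    """Split text into sentence segments (naive: breaks on abbreviations)."""
    return [s for s in SENTENCE_END.split(text.strip()) if s]


def tokenise(text):
    """Return one list of tokens per sentence segment."""
    return [TOKEN.findall(sentence) for sentence in split_sentences(text)]


segments = tokenise("CST has 14 demos. Try the tokeniser!")
for tokens in segments:
    print(tokens)
```

Each printed list corresponds to one sentence segment, with punctuation kept as separate tokens.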