Curriculum Vitae - Bart Jongejan

Personal data:

Name: Bart Jongejan, nationality: Dutch.

Born in Hellendoorn, the Netherlands, 13 July 1953. Married in 1992, 2 children.

Home address: Cæciliavej 31, 2500 Valby, Denmark.


Education:

Utrecht University 1971-1981: doctorandus (drs.) in physics.

Ordina 1983: programming and system analysis.

Exin 1984-1986: five IT courses.

IBM 1991: introduction to AIX.

SuperUsers 1991: Motif, Oracle SQL. 1993: O-O design. 2002: UNIX Systemadministration Grundkursus. 2003: Linux Systemadministration Grundkursus, Linux netværk, Linux Sikkerhed, Linux Administration Advanced.

Skills picked up during employment:

Programming: PL/1, C, Snobol, dBase, C++, Delphi, CORBA, Interbase, Java, PHP, JavaScript, XML, HTML, UNIX, Linux, Windows, DOS, pattern matching algorithms

Other: validation of Dutch corpora


Employment:

1979-1981: Student-assistant at Utrecht University, didactics of physics.

·         Research on the formation of physical concepts in children.

1981-1982: Physics and mathematics teacher at Werkplaats Kindergemeenschap, Bilthoven (Netherlands)

1984-1986: PL/1 mainframe programmer at insurance company AMEV.

·         Maintenance of administrative software

·         Development of successful name parser.

1986-1990: PC software development at Utrecht University, faculty of humanities.

Selected projects:

·         Graphical text collator.

·         First version of the Iconclass browser.

1990-1996: PC and Unix software development at CRI A/S (as of 1996: WM-Data), Birkerød.

Selected projects:

·         Text Memory (early TM system).

·         Re-implementation of Constraint Grammar Parser in C++. (SIMPR - ESPRIT Project 2083)

·         Development of multi-threaded kernel, integration of software, technical co-ordination. (KAVAS-2 - AIM A2019)

1997-now: Software development at Center for Sprogteknologi, Copenhagen.

Selected projects:

·         Maintenance of word database software for the Danish Language Council.

·         Software adaptation of Dutch spell-checker to Danish. (Scarrie)

·         Development of repetitiveness checker and text comparison tool. (TransRouter)

·         Integration of PC-based user interface and UNIX-based MT-system using CORBA. (Otelo)

·         Integration of speech and gesture recognition software into CST's Njal parser and construction of communication manager. (Staging)

·         Construction of Translatability Checker. (TQPro)

·         Construction of a lemmatiser/stemmer for Danish. (STO)

·         Investigation of commercially available AECMA-compliant software for supporting Controlled English, creation of text corpora, and automated analysis of the corpora. (VID)

·         Porting term-databases from Unify under HP-UX to MySQL under Linux. (Patrans)

·         Design and development of affix-based lemmatiser, trained for over 10 languages. (Tvärsök)

·         Anvil Plug-in: automatic generation of annotations for head movements - velocity/acceleration and direction. (Nomco)

·         Decomposition of words into morphemes on Windows Mobile platform. (Melfo: Reading aid for dyslexics.)

·         Automatic composition (based on user’s wishes as to output) and execution of web-based tool chains. (DK-Clarin)


Reviewed publications

Underwood, N.L. & B. Jongejan: Profiling Translation Projects - An Essential Part of Routing Translations. In: Proceedings of the 8th International Conference on Theoretical and Methodological Issues in Machine Translation, Chester, England, August 1999.

Paggio, P., B. Jongejan and C.B. Madsen (2000): Unification-based Multimodal Analysis in a 3D Virtual World: the Staging project. In: Proceedings of the CELE-Twente Workshop on Language Technology: Interacting Agents, pp. 71-82.

Paggio, P. and B. Jongejan (2000): Representing multimodal input in a unification-based system: the Staging project. In: Proceedings of the Workshop on Integrating Information from Different Channels in Multi-Media Contexts at ESSLLI, pp. 24-31.

Underwood, N.L. & B. Jongejan: Translatability Checker: A Tool to Help Decide Whether to Use MT. In: Proceedings of MT Summit VIII, Santiago de Compostela, Spain, September 18th-22nd 2001, pp. 363-368.

Paggio, P. & B. Jongejan: Multimodal Communication in the Virtual Farm of the Staging Project. In: Proceedings of the workshop on Information Presentation and Natural Multimodal Dialogue 2001. Verona, December 2001, pp. 41-45.

Patrizia Paggio and Bart Jongejan (2005): Multimodal Communication in Virtual Environments Communicating with the Staging virtual farm. In: O. Stock and M. Zancanaro (eds) Multimodal Intelligent Information Presentation, Kluwer Academic Publishers, pp. 27-47. ISBN 1-4020-3051-7.

H. Dalianis, B. Jongejan: Hand-crafted versus Machine-learned Inflectional Rules: The Euroling-SiteSeeker Stemmer and CST's Lemmatiser. In: Proceedings of LREC 2006, Genova, May 2006, pp. 663-666.

Karlgren, J, Dalianis, H and Jongejan, B: Experiments to Investigate the Connection between Case Distribution and Topical Relevance of Search Terms in an Information Retrieval Setting. In: Proceedings of LREC 2008, Marrakech, May 2008.

Jongejan, Bart and Dalianis, Hercules: Automatic training of lemmatization rules that handle morphological changes in pre-, in- and suffixes alike. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP. Suntec, Singapore: Association for Computational Linguistics, 2009, pp. 145-153.

Roux, J, Scholtz, P, Klop, D, Povlsen, C, Jongejan, B & Magnusdottir, AO 2010: Incorporating Speech Synthesis in the Development of a Mobile Platform for E-learning. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation: LREC 2010. European language resources distribution agency, Valletta, Malta.

Jongejan, B.: Automatic face tracking in Anvil. In: Proceedings of LREC 2010, Malta: LREC Workshop on Multimodal Corpora, 2010, pp. 154-156.

Bart Jongejan: Automatic annotation of head velocity and acceleration in Anvil. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey, 23-25 May 2012.

Bart Jongejan: Workflow Management in CLARIN-DK. In: Proceedings of the workshop on Nordic language research infrastructure at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 20, article 002.


Selected Project Reports

Bart Jongejan (1992) CGP++, the C++ equivalent of CGP. Programmer's manual. (SIMPR) CRI A/S

Bart Jongejan (1993) Integration Cook Book. (KAVAS-2, MON 2 activity)

Bart Jongejan (1994) Design of a mechanism for control of the learning process within the KAVIAR tool. (KAVAS-2, MON 2.1)

Bart Jongejan (1995) Programmer's reference guide to the KAVIAR Controller. (KAVAS-2, MON 6 activity)

Calder, Jo, Bart Jongejan, Margaret King, Jürgen Reischer, Johannes Ritzke and Nancy Underwood (editor) (1999). TransRouter Component Tools and Profiles, TransRouter deliverable D3.2 (D3.1). 54 pp.

Lina Henriksen, Bart Jongejan, Bente Maegaard (2003): Kontrolleret sprog - Indledende analyse af virksomhedernes regelsæt og sammenligning med eksisterende regelsæt, VID-rapport nr. 1

Bart Jongejan, Bolette S. Pedersen, Costanza Navarretta (2004): Automatisk analyse af Zaccos og Ankiros tekstmateriale, VID-rapport nr. 3

Bart Jongejan (2005): A comparison of HyperSE and Nordea's and Bang & Olufsen's requirements, VID-report nr. 5

Other publications

Henriksen, L., B. Jongejan, B. Maegaard: Controlled Language promoting readability and Tone-of-Voice. In: Forståeleg språk for alle, Rapport frå ein nordisk konferanse om klarspråk, p. 82-88. Nordens Språkråd 2005.

Jongejan, B. (2006). CST’s lemmatiser for dansk. In: Sprogteknologi i dansk perspektiv, Reitzel, København, pp. 370-390. ISBN 87-7876-459-9