The DatAptor project is a technology-oriented project supported by STW in cooperation with industrial partners, including the Translation Automation User Society (TAUS) and some of its member organizations and companies. The project aims at research and development of advanced statistical machine translation (SMT) systems that adapt to a new domain automatically, using algorithms for extracting and weighting suitable training instances from an industry-scale, multi-domain parallel corpus. It addresses various challenges, including statistical weighting of parallel data instances to better fit user-supplied example documents, SMT adaptation by example to new domains, and extensions of hierarchical and syntactic SMT systems. The PhD student is expected to work, among other things, on the last of these topics, i.e., hierarchical (and syntax-enriched) SMT systems.

Tasks

The PhD student is expected to:
- complete and defend a PhD thesis within the official appointment duration of four years;
- regularly present intermediate research results at international conferences and workshops, and publish them in proceedings and journals;
- collaborate with the other DatAptor team members (two postdocs, a programmer and the PI) on developing and testing SMT systems according to the DatAptor project plan;
- collaborate with researchers in other relevant parts of ILLC, particularly within the Language and Computation research theme;
- participate in the organization of research activities and events at ILLC, such as conferences, workshops and joint publications;
- assist in the teaching activities of ILLC.

The candidate should have:
- a Master’s degree with excellent grades in a relevant field, including Statistical Natural Language Processing, Statistical Computational Linguistics, Machine Learning or other related branches of Artificial Intelligence;
- a suitable background in parsing formal grammars, probabilistic modelling and statistical learning;
- experience in programming and affinity with empirical work involving corpora;
- good academic writing and presentation skills;
- good social and organizational skills;
- a strong interest in a scientific career.