KKLT0040 Corpus Linguistics and Language Technology (5 cr)

Description

During the course, the students familiarize themselves with various ready-made corpora and the principles behind their compilation. In addition, the students learn methods for analyzing large digital corpora. These include both traditional corpus linguistics methods and new possibilities offered by natural language processing, such as automatic syntactic analysis, distributional semantics, text classification and sentiment analysis. The studied corpora represent various languages and genres, such as social media, learner language and texts from different time periods.

Learning outcomes

After the course, the student is familiar with ready-made corpora from different fields, understands the importance of corpora in linguistics and knows how to avoid the most common problems in corpus compilation. Further, the student knows how to use corpus tools, such as AntConc and WordSmith, is familiar with basic natural language processing tools and how they work, and understands the potential of machine learning for language studies.

Additional information

The course is aimed at all students interested in corpora and quantitative methods. It can be included either in the Master’s programmes of the School of Languages and Translation Studies or in the Language Technology study module, where it is the first, introductory course.

Description of prerequisites

No previous knowledge of programming is required.

Completion methods

No completion methods