Quantifying Meaning
Time: Tue 2023-01-17 09.00
Location: F3, Lindstedtsvägen 26 & 28, Stockholm
Video link: https://kth-se.zoom.us/j/66943302856
Language: English
Subject area: Computer Science
Doctoral student: Amaru Cuba Gyllensten, Computational Science and Technology (CST)
Opponent: Professor Hinrich Schütze, Centrum für Informations- und Sprachverarbeitung, Ludwig-Maximilians-Universität, Munich, Germany
Supervisor: Adjunct Professor Anders Holst, Computational Science and Technology (CST)
QC 20221207
Abstract
Distributional semantic models are a class of machine learning models that aim to construct, in a data-driven fashion, representations that capture the semantics, i.e. the meaning, of meaning-bearing objects. This thesis is particularly concerned with the construction of semantic representations of words, an endeavour that has a long history in computational linguistics and that has seen dramatic developments in recent years.
The primary research objective of this thesis is to explore the limits and applications of distributional semantic models of words, i.e. word embeddings. In particular, it explores the relation between model and embedding semantics: how model design influences what embeddings encode, how to reason about embeddings, and how properties of the model can be exploited to extract novel information from embeddings. Concretely, the thesis introduces topologically aware neighborhood queries, which enrich the information gained from neighborhood queries on distributional semantic models; conditioned similarity queries, and models that enable them; and concept extraction from distributional semantic models. It also presents applications of embedding models in political science and a thorough evaluation of a broad range of distributional semantic models.