Introduction to Word Representations
In this lecture I introduce one of the key ideas in modern NLP: that we can represent words as vectors and learn these vector representations automatically from data.
Lecture slides are available here.
Properties of word embeddings
Which of the following properties can be seen as an advantage of word embeddings over one-hot vectors?
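As a quick illustration of the contrast behind this question (my own sketch with made-up toy values, not part of the quiz), the code below compares a one-hot vector with a dense embedding for the same word.

```python
import numpy as np

vocab_size = 10      # toy vocabulary size
word_index = 3       # position of some word in the vocabulary

# One-hot vector: as long as the vocabulary, a single 1, all other entries 0.
# Any two distinct one-hot vectors are orthogonal, so they say nothing about
# how similar the corresponding words are.
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1.0

# Dense embedding: a short real-valued vector (random here, learned in practice).
# Learned embeddings place words with similar meanings close to each other.
embedding = np.random.randn(4)

print(one_hot)
print(embedding)
```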
Co-occurrence matrices
You are building a co-occurrence matrix for a language with a vocabulary of 100,000 words. How many entries does the complete matrix have? Assume that all words in the vocabulary can be used both as target words (words that you want to get word embeddings for) and as context words.
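For intuition (this sketch is my own illustration with a toy corpus, not the quiz answer), a complete co-occurrence matrix has one row per target word and one column per context word, so its size is determined entirely by the vocabulary:

```python
from collections import defaultdict

# Toy corpus; the quiz asks you to scale the same reasoning to 100,000 words.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
window = 2  # symmetric context window

# Every vocabulary word is both a target word and a context word.
vocab = sorted({w for sentence in corpus for w in sentence.split()})
index = {w: i for i, w in enumerate(vocab)}

# Count how often each (target, context) pair occurs within the window.
counts = defaultdict(int)
for sentence in corpus:
    words = sentence.split()
    for i, target in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                counts[(index[target], index[words[j]])] += 1

# The complete matrix is |V| x |V|, i.e. it has |V| * |V| entries.
print(len(vocab), "x", len(vocab), "=", len(vocab) ** 2, "entries")
```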
Semantic similarity
Which of the following words would you not expect to see as a close neighbour of the word ‘life’?
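If you want to check your intuition against real embeddings, a sketch along the following lines (assuming the gensim library is installed and its pre-trained GloVe vectors can be downloaded) prints the nearest neighbours of a word:

```python
import gensim.downloader as api

# Load a small set of pre-trained word vectors (downloaded on first use).
model = api.load("glove-wiki-gigaword-50")

# Nearest neighbours ranked by cosine similarity.
for word, similarity in model.most_similar("life", topn=5):
    print(f"{word}\t{similarity:.2f}")
```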
Cosine similarity
Here are two word vectors:
$\mathbf{a} = \begin{bmatrix} -2 & 2 \end{bmatrix}$
$\mathbf{b} = \begin{bmatrix} 1 & 0 \end{bmatrix}$
What is the cosine similarity of these two vectors, rounded to two decimals?
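As a reminder of how the measure is computed, here is a minimal NumPy sketch of cosine similarity; the example vectors are arbitrary, so plug in $\mathbf{a}$ and $\mathbf{b}$ from the question to check your answer.

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (||a|| * ||b||)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Arbitrary example vectors (not the ones from the question):
print(round(cosine_similarity(np.array([1.0, 2.0]), np.array([2.0, 1.0])), 2))  # 0.8
```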
Exploring word embeddings
The Embedding Projector allows you to explore the vector spaces of pre-trained word embeddings. Load the default vector space (Word2Vec 10K) and search for the word ‘artificial’. What is the cosine distance of the nearest neighbour of this word in the original space?
(Cosine distance is defined as $1-c$ where $c$ is the cosine similarity.)
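For reference, SciPy's cosine distance function implements exactly this definition, i.e. one minus the cosine similarity:

```python
from scipy.spatial.distance import cosine  # cosine *distance*, i.e. 1 - similarity

# Same arbitrary example vectors as above: similarity 0.8, distance 0.2.
print(round(cosine([1.0, 2.0], [2.0, 1.0]), 2))
```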
Training tasks
In the prediction-based approach to learning word embeddings, we obtain the embeddings as a by-product of learning to solve a training task. Which training task should you choose so that words with similar vectors have similar meanings?
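To make the prediction-based setup concrete, the sketch below (my own illustration, assuming a skip-gram-style task as in word2vec) extracts (target, context) training pairs from a toy corpus; a model trained to predict the context word from the target word ends up giving similar vectors to words that appear in similar contexts.

```python
# Skip-gram-style training data: predict each context word from its target word.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
window = 2  # symmetric context window

pairs = []
for sentence in corpus:
    words = sentence.split()
    for i, target in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                pairs.append((target, words[j]))

print(pairs[:5])  # [('the', 'cat'), ('the', 'sat'), ('cat', 'the'), ('cat', 'sat'), ('cat', 'on')]
```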