Foundations of AI and ML

Introduction to Word Representations

In this lecture I introduce one of the key ideas in modern NLP: that we can represent words as vectors and learn these vector representations automatically from data.

Lecture slides are available here.

Properties of word embeddings

Which of the following properties can be seen as an advantage of word embeddings over one-hot vectors?

Co-occurrence matrices

You are building a co-occurrence matrix for a language with a vocabulary of 100000 words. How many entries does the complete matrix have? Assume that all words in the vocabulary can be used both as target words (words that you want to get word embeddings for) and context words.
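As background, the sketch below shows how a co-occurrence matrix could be built from a toy corpus, assuming a symmetric context window of one word on each side. The corpus, the window size, and all identifiers are illustrative assumptions, not part of the question.

```python
from collections import defaultdict

# Toy corpus and window size are illustrative assumptions.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
window = 1  # one context word to the left and one to the right

vocab = sorted({w for sentence in corpus for w in sentence})
index = {w: i for i, w in enumerate(vocab)}

# Every vocabulary word acts both as a target word and as a context word,
# so the complete matrix has len(vocab) * len(vocab) entries.
counts = defaultdict(int)
for sentence in corpus:
    for i, target in enumerate(sentence):
        lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(index[target], index[sentence[j]])] += 1

print(f"{len(vocab)} x {len(vocab)} matrix, {len(counts)} non-zero entries")
```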

Semantic similarity

Which of the following words would you not expect to see as a close neighbour of the word ‘life’?

Cosine similarity

Here are two word vectors:

$\mathbf{a} = \begin{bmatrix} -2 & 2 \end{bmatrix}$

$\mathbf{b} = \begin{bmatrix} 1 & 0 \end{bmatrix}$

What is the cosine similarity of these two vectors, rounded to two decimals?

Exploring word embeddings

The Embedding Projector allows you to explore the vector spaces of pre-trained word embeddings. Load the default vector space (Word2Vec 10K) and search for the word ‘artificial’. What is the cosine distance of the nearest neighbour of this word in the original space?

(Cosine distance is defined as $1-c$ where $c$ is the cosine similarity.)
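For reference, here is a minimal sketch of how cosine similarity and cosine distance can be computed for two vectors, following the definition above. The example vectors are placeholders, not those used in the questions.

```python
import math

def cosine_similarity(a, b):
    # Dot product of a and b divided by the product of their Euclidean norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    # Cosine distance as defined above: 1 minus the cosine similarity.
    return 1 - cosine_similarity(a, b)

# Placeholder vectors for illustration only.
u = [1.0, 2.0]
v = [2.0, 1.0]
print(round(cosine_similarity(u, v), 2))  # 0.8
print(round(cosine_distance(u, v), 2))    # 0.2
```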

Training tasks

In the prediction-based approach to learning word embeddings, we obtain the embeddings as a by-product of learning to solve a training task. Which training task should you choose if you want words with similar meanings to end up with similar word vectors?
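As an illustration of what a prediction-based setup can look like, the sketch below generates (target, context) training pairs in a skip-gram style: a model would be trained to predict the context word from the target word, and the word vectors are read off the learned input representation. This is just one possible training task, shown for illustration; the corpus and window size are assumptions.

```python
# Generate (input, label) pairs for a skip-gram-style prediction task:
# given a target word, predict each word in its context window.
# The corpus and window size are illustrative assumptions.
corpus = [["we", "learn", "word", "vectors", "from", "data"]]
window = 2

pairs = []
for sentence in corpus:
    for i, target in enumerate(sentence):
        lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, sentence[j]))

for target, context in pairs[:6]:
    print(f"input: {target:>7}  ->  predict: {context}")
```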

This webpage contains the course materials for the course TDDE56 Foundations of AI and Machine Learning.
The content is licensed under Creative Commons Attribution 4.0 International.
Copyright © 2022 Linköping University