Machine Learning Interviews

This video talks about the evaluation metric BERTScore, why it needed over existing metrics such as the BLEU score and so on and how it is computed and evaluated. Traditional metrics look at exact text match. BERTScore looks at semantic similarity leveraging contextual word embeddings of words in the candidate and the reference sentences.

Suppose you are modeling text with a HMM, What is the complexity of finding most the probable sequence of tags or states from a sequence of text using brute force algorithm?

Multi Head Attention : Transformer Architecture

Scaled Dot Product Attention

Skip or Residual Connections in Deep Networks

GPT Model

The BERT Score – Evaluating Text Generation

BERT Model

The Transformer Architecture

Why are Transformers So Successful?

Batch vs Mini-Batch vs Stochastic Gradient Descent

What is Layer Normalization