Perplexity
An Intrinsic Evaluation method that answers:
How uncertain is a model about the predictions it makes? Models with low uncertainty (high confidence) tend to be more accurate.
Surprisal → Entropy → Perplexity
The model is “as confused” as if it had to choose uniformly at random between that many units (characters or words). This also means that the worst-case scenario is bounded by the Branching Factor (the vocabulary size).
Resources
Artificial Intelligence 2
The perplexity of a sequence W = w_1 w_2 … w_N is

PP(W) = P(w_1 w_2 … w_N)^(−1/N)
Intuition The reciprocal of the sequence probability, normalized by sequence length (a geometric mean of the per-token reciprocal probabilities).
For a language with n characters or words and a language model that predicts that all are equally likely, the perplexity of any sequence is n. If some characters or words are more likely than others, and the model reflects that, then the perplexity of correct sequences will be less than n.
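The uniform-model claim above can be checked numerically. The sketch below (an illustration, not from the course materials) computes perplexity from the per-token probabilities a model assigns to a sequence, working in log space for numerical stability:

```python
import math

def perplexity(token_probs):
    """Perplexity of a sequence, given the model's probability for each
    actual token: the geometric mean of the reciprocal probabilities,
    i.e. (prod p_i)^(-1/N), computed in log space."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A uniform model over a 10-symbol vocabulary assigns p = 0.1 to every
# token, so the perplexity of ANY sequence equals the branching factor:
print(perplexity([0.1] * 5))  # → 10.0 (up to floating-point rounding)

# A model that assigns higher probability to the tokens that actually
# occur is less "confused": its perplexity drops below the branching factor.
print(perplexity([0.5, 0.4, 0.3, 0.5, 0.2]))
```

Note that perplexity only rewards a non-uniform model when its preferences match the data: assigning low probability to the tokens that actually occur drives perplexity above the branching factor.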