Ask ChatGPT this question and it starts rattling like a shaken pinball machine…

And here is part of the explanation:
If the model wants to output the word “hello”, it needs to construct a residual similar to the vector for the “hello” output token, which the lm_head can then turn into the “hello” token id. And if the model wants to output a seahorse emoji, it needs to construct a residual similar to the vector for the seahorse emoji output token(s) – which in theory could be any arbitrary value, but in practice is seahorse + emoji, word2vec style.

The only problem is that the seahorse emoji doesn’t exist! So when this seahorse + emoji residual hits the lm_head, it does its dot product over all the token vectors, and the sampler picks the closest token – a fish emoji.
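The mechanism can be sketched numerically. Below is a toy model, with made-up embedding vectors standing in for real lm_head weights: we compose a “seahorse + emoji” residual, dot it against every vocabulary vector, and greedy sampling lands on the fish emoji, because that is the nearest existing token.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # toy embedding dimension

# Hypothetical output-embedding vectors (one lm_head row per token).
seahorse_word = rng.normal(size=d)
emoji_feature = rng.normal(size=d)
# "fish" is semantically close to "seahorse", so its vector is nearby.
fish_word = seahorse_word + 0.3 * rng.normal(size=d)

vocab = {
    "hello": rng.normal(size=d),
    "seahorse (word)": seahorse_word,
    "🐟": fish_word + emoji_feature,  # fish + emoji, word2vec style
    "🐴": rng.normal(size=d) + emoji_feature,
}

# The model builds a residual for "seahorse + emoji" --
# but no lm_head row corresponds to a seahorse emoji.
residual = seahorse_word + emoji_feature

# lm_head: dot product of the residual with every output embedding;
# greedy sampling then picks the highest-scoring (closest) token.
logits = {tok: vec @ residual for tok, vec in vocab.items()}
print(max(logits, key=logits.get))
```

With these stand-in vectors the fish emoji wins: its row shares both the “fish-like” direction and the “emoji” direction with the residual, so its dot product beats the plain word “seahorse” and the unrelated emoji.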
As a relief – here is my most recent seahorse photo, taken at the Musée océanographique de Monaco last week.
