More or less that. There's a point along the path the input takes through the language model where the induced randomness can significantly affect the output, or not. If all the weights point to the same end node, because the "confidence" is high, then no matter the random seed, the output will be the same. When the seed greatly affects the final result, it's because the weights don't point with that confidence to a unique end node, so the small randomness introduced at the beginning (the seed, so to speak) greatly changes the result. It is here where you are most likely to get a hallucination.
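A toy sketch of what I mean, just the sampling step at the end (the tokens and probabilities here are completely made up, this is not how any real model stores things internally):

```python
import random

# Two made-up next-token distributions: one "peaked" (high confidence),
# one "flat" (low confidence). The seed only matters for the flat one.
confident = {"Paris": 0.97, "Lyon": 0.02, "pizza": 0.01}
unsure = {"1942": 0.30, "1947": 0.28, "1951": 0.22, "teapot": 0.20}

def sample(dist, seed):
    """Pick one token at random, weighted by its probability."""
    rng = random.Random(seed)
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights)[0]

for seed in range(5):
    print(seed, sample(confident, seed), sample(unsure, seed))

# The confident column is almost always "Paris" no matter the seed;
# the unsure column jumps around with the seed, and that's exactly
# where a wrong ("hallucinated") pick is most likely.
```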
To put it again in terms of the much easier to visualize earlier neural networks: when you didn't train the model enough, Mario just made random movements without attempting to complete the level, because the weights of the neurons could not reliably take the input and transform it into a useful output. That is something that could be solved in smaller models. For larger models it gets incredibly complicated because of the massive amount of data, the complexity of that data, and the complexity of a proper training run. But it's not something impossible, or that we could never get rid of. The same way you can get Mario to finally complete every level every time without issues, you can get a non-hallucinating chatbot; it just takes more technology improvements.
I suppose it could be said that the nature of language is chaotic like weather and not deterministic like a Mario level, and thus it would actually be "impossible" to get reliable results at scale, like it's impossible to get a precise weather forecast a month in advance. But I'm not sure there's enough evidence to support that, as hallucinations don't happen uniformly across the board; they tend to happen on matters that had little training data. Matters with plenty of training data don't produce hallucinations, even in today's models.
I searched slm online and found the small models you mentioned. I wasn't referring to those; those are just small large language models IMO, if that makes any sense. A proper SLM should also have a small purpose, not general chat. I mostly mean the current chatbots that point you to predefined answers, or the summarizing ones. Nothing that can really compose a written answer word by word.
Currently, and to my knowledge, there isn't any general language model that can just write up answers and is good enough not to hallucinate. But certainly we are getting closer each year.
Edit: I've been looking for an example, here: https://www.tax.service.gov.uk/ask-hmrc/chat/self-assessment These kinds of chatbots know when their answer is not precise and default to a polite "ask again" answer instead of just telling you the first "hallucination" that came to them. They are powered by similar AI technology, but they're not general purpose and cannot write word by word. The difference is that they "know" whether the answer is precise or not.
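That "refuse instead of guess" behavior can be sketched roughly like this (purely illustrative; I have no idea how HMRC's bot is actually built, and the intents, scores, and threshold are all invented for the example):

```python
# Sketch of a chatbot that only returns pre-written answers, and
# falls back politely when its classifier isn't confident enough.
CONFIDENCE_THRESHOLD = 0.75  # made-up cutoff

CANNED_ANSWERS = {
    "filing_deadline": "The Self Assessment deadline is ...",
    "payment_plan": "You can set up a payment plan by ...",
}

def classify(question: str) -> tuple[str, float]:
    """Stand-in for a real intent classifier: returns the best-matching
    intent and a confidence score between 0 and 1."""
    # Trivial keyword matching, just to make the sketch runnable.
    scores = {
        "filing_deadline": 0.9 if "deadline" in question.lower() else 0.2,
        "payment_plan": 0.9 if "payment" in question.lower() else 0.2,
    }
    intent = max(scores, key=scores.get)
    return intent, scores[intent]

def answer(question: str) -> str:
    intent, confidence = classify(question)
    if confidence < CONFIDENCE_THRESHOLD:
        # Low confidence: refuse rather than guess ("hallucinate").
        return "Sorry, I didn't get that. Could you rephrase your question?"
    return CANNED_ANSWERS[intent]

print(answer("When is the deadline?"))  # canned answer
print(answer("What about my dog?"))     # polite fallback
```

The key design choice is that the bot can only ever emit a pre-written answer or the fallback, so by construction it never composes a wrong sentence word by word.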
Like taking your phone and going through your WhatsApp messages?
If that's a concern, you could set up a password to access any sensitive app, or any chat within that app.
I think that is a more sensible approach, as if you are targeted for any reason, an undercover cop could get hold of your unlocked phone in many different ways.