If that’s their solution, then they have absolutely no understanding of the systems they’re using.
ChatGPT isn’t prone to hallucination because it’s ChatGPT; it’s prone because it’s an LLM. That’s a fundamental problem common to all LLMs.
Phi-4 is the only one I am aware of that was deliberately trained to refuse instead of hallucinating. It’s mind-blowing to me that that isn’t standard. Everyone is trying to maximize benchmarks at all costs.
I wonder if diffusion LLMs will hallucinate less, since they inherently have error correction built into their inference process.
Even that won’t be truly effective. It’s all marketing at this point.
The problem of hallucination really is fundamental to the technology. If there’s a way to prevent it, it won’t be as simple as training the model differently.