A true story
In 2024, a Canadian tribunal ruled against Air Canada after the airline's support chatbot invented a bereavement refund policy that the airline did not actually offer. The customer was awarded what the bot had promised. The chatbot wasn't drunk. It was confidently generating plausible-sounding information that wasn't in any document the airline had indexed.
This kind of failure is not exotic. It is the default failure mode of any LLM-backed chatbot that treats its own output as authoritative. And it is almost entirely solved by one feature: citations.
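
To make that concrete, here is a minimal sketch of the contract citations impose on a chatbot: an answer ships only when it can point at an indexed passage, and the bot declines otherwise. Everything here (the names SourceChunk and answer_with_citations, and the naive substring grounding check) is hypothetical; a production system would use retrieval plus an entailment check rather than substring matching.

```python
from dataclasses import dataclass

@dataclass
class SourceChunk:
    doc_id: str  # identifier of the indexed document
    text: str    # the passage an answer must be grounded in

def answer_with_citations(claim: str, retrieved: list[SourceChunk]) -> str:
    """Emit the claim only if an indexed passage supports it, with a citation.

    A naive substring check stands in for a real entailment model.
    """
    for chunk in retrieved:
        if claim.lower() in chunk.text.lower():
            return f"{claim} [source: {chunk.doc_id}]"
    # Nothing in the index supports the claim: refuse rather than improvise.
    return "I can't find that in the indexed documentation, so I won't state it."

index = [
    SourceChunk("bereavement-policy.html",
                "Bereavement fares must be requested before travel."),
]

# Grounded claim: answered, with a citation the customer (or a court) can check.
print(answer_with_citations("Bereavement fares must be requested before travel", index))

# Ungrounded claim: the bot declines instead of inventing a refund policy.
print(answer_with_citations("You may claim a bereavement refund after flying", index))
```

The matching logic is beside the point; what matters is the failure mode it forecloses. An answer that cannot name its source never reaches the customer.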
