Understanding the causes of hallucinations in large language models

Author:

Abstract: Hallucinations in large language models (LLMs) are a systemic problem that manifests when models generate information that does not correspond to the ground truth or the input data. This phenomenon significantly limits the application of LLMs in mission-critical domains such as medicine, law, research, and journalism, where the accuracy and reliability of information are of utmost importance. This paper provides a comprehensive analysis of three key factors that contribute to hallucinations: issues related to the quality and structure of training data; architectural features of transformer models that predispose them to error accumulation; and the lack of built-in fact-checking mechanisms, which leaves models relying solely on statistical regularities. Each of these factors is discussed in detail with reference to relevant research, and potential solutions are proposed. The paper includes three dedicated graphs that visualize the relationship between various model parameters and the occurrence of hallucinations. The results of the study indicate the need for a comprehensive approach to improving LLMs, spanning better data preprocessing methods, modifications to the model architecture, and the introduction of additional verification mechanisms.
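To make the point about the absence of built-in fact-checking concrete, the minimal Python sketch below (illustrative only, not code from the paper; the vocabulary and scores are hypothetical) shows that next-token generation is driven entirely by a learned probability distribution, with no step that verifies factual correctness.

import math
import random

# Hypothetical toy vocabulary and model scores (not taken from the paper).
vocab = ["Paris", "London", "Berlin", "Rome"]
logits = [2.1, 1.9, 0.3, 0.1]

# Softmax turns the scores into probabilities -- statistical regularities only.
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# The next token is sampled by plausibility; nothing here checks truth.
next_token = random.choices(vocab, weights=probs, k=1)[0]
print(next_token)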

Bibliographic description of the article:

Understanding the causes of hallucinations in large language models // Наука онлайн: Міжнародний електронний науковий журнал. – 2024. – №9. – https://nauka-online.com/publications/information-technology/2024/9/03-35/

Published in: Наука Онлайн No. 9, September 2024
