Misleading data poses a threat in sensitive areas such as investment management or credit granting
What would happen if an artificial intelligence (AI) system designed to predict the future of the stock market, and trained on data from a period of economic stability, faced an imminent crisis? If it had not been trained to recognize the warning signals, it might interpret a small increase in trading volume as a sign of continued growth and wrongly predict that stock prices will rise, with serious consequences for the market.
Similarly, if an AI tool that analyzes financial market sentiment from news and social media posts receives inadequate training, it may misinterpret expressions or context, producing analysis that does not reflect true market sentiment and leading to investment decisions based on distorted information.
These examples show that as AI becomes more prevalent in the financial sector, it brings not only avenues for innovation and automation but also challenges such as so-called AI “hallucinations,” a term for situations in which AI models generate and disseminate false or misleading information.
In the fintech world, AI is here to stay: the market was valued at $1.12 billion in 2023, and its rapid growth suggests it will reach $4.37 billion by 2027, according to estimates from Market.us. However, according to an analysis by the startup Vectara, chatbot “hallucination” rates range from 3% to 27%, a problem for the financial sector, where accurate decisions are crucial.
Julián Colombo, CEO and founder of N5, says AI can present hallucinations, that is, errors or incorrect interpretations of data, which “leads to erroneous conclusions.” Julio Blanco, co-founder and CBO of Zentricx, clarifies that, in essence, “the result is an invention of the model and is not supported by real information.”
Large language models (LLMs), explains Weslley Rosalem, senior AI leader at Red Hat, work on the basis of conditional probabilities learned from training data. “They generate the next word or token based on the probability distributions of these sequences. Hallucinations occur when the model produces results that are statistically plausible but do not correspond to factual reality. These models capture statistical relationships, but do not have a true understanding of the content,” he explains.
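To make that idea concrete, here is a minimal, purely illustrative sketch of next-token sampling; the context, tokens and probabilities are invented for the example and do not come from any real model.

```python
# Minimal sketch (not any specific model's code): an LLM picks the next token
# by sampling from a learned conditional probability distribution. A token that
# is statistically plausible can still be factually wrong -- a hallucination.
import random

# Hypothetical probabilities learned for the context "stock prices will ..."
next_token_probs = {"rise": 0.55, "fall": 0.30, "stagnate": 0.15}

def sample_next_token(probs: dict[str, float]) -> str:
    """Sample one token proportionally to its learned probability."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

print("stock prices will", sample_next_token(next_token_probs))
```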
In the spotlight
In the financial sector, these hallucinations can occur in several areas, such as credit analysis, where a model can assign a customer “a risk profile that does not reflect their true financial situation, potentially resulting in inappropriate credit granting decisions”, says Colombo.
Blanco adds that in customer service, a conversational search engine may replace the “frequently asked questions” page and end up making incorrect recommendations about services or their costs, or hallucinate so badly that it fails to resolve customers’ doubts at all. Models can also hallucinate when generating financial reports or performing complex calculations to estimate or predict trends: “More than predicting, they would be guessing a future without any real basis,” he points out.
In automated financial advice, hallucinations can lead to recommendations of inappropriate investment strategies based on faulty data or algorithms. They can also cause problems in fraud detection and risk management. “Hallucinations can lead to false positives or negatives, compromising the effectiveness of identifying fraudulent activity or assessing risk,” says Rosalem.
In this industry, hallucinations can lead to significant financial losses, reputational damage for institutions and customer dissatisfaction. “In addition, decisions based on flawed analyses can increase the risk of fraud or regulatory non-compliance, exposing companies to regulatory sanctions. It is crucial to implement validation and monitoring measures to ensure that AI systems operate accurately and transparently, thus minimizing the associated risks,” emphasizes Colombo.
Similarly, hallucinations can lead to inefficient decision-making. “Hallucinations can compromise the quality of strategic decisions, affecting the institution’s competitiveness in the market,” adds the Red Hat specialist.
Minimize the risk
At Zentricx, they say that the main way to minimize hallucinations is to ensure that the information used is reliable. “If the model is given false information, it learns to repeat the same falsehoods. We always recommend a data consulting project before developing a complex AI model.”
Regarding data quality, Blanco stresses that it is a “central” point in reducing hallucinations. “It is necessary to ensure that AI models are trained with diverse, balanced and well-structured data, and also to perform stress tests on the AI model,” he notes.
Rosalem suggests that strategies such as RAG (Retrieval-Augmented Generation) or RIG (Retrieval Interleaved Generation) minimize the effects of hallucinations in LLMs, since they “combine language models with information retrieval systems.” The LLM is fed specific information retrieved from relevant databases or documents, which allows it to generate more accurate and up-to-date answers and reduces exclusive reliance on training data, which may be outdated or incomplete.
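As a rough illustration of the retrieval-then-generate idea (not Red Hat’s implementation), the sketch below grounds a prompt in documents retrieved from a small, hypothetical store; the document texts, the keyword-overlap retriever and the build_grounded_prompt helper are all assumptions made for the example.

```python
# Minimal RAG sketch: retrieve relevant passages first, then ground the prompt
# in them. The document store, the scoring and generate step are hypothetical
# placeholders, not a specific library's API.

DOCUMENTS = [
    "Q3 credit policy: loans above $50,000 require a manual risk review.",
    "Current savings account fee: $0 per month for balances over $1,000.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Naive keyword-overlap retrieval; real systems use vector search."""
    query_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(query_words & set(d.lower().split())))
    return scored[:k]

def build_grounded_prompt(query: str) -> str:
    context = "\n".join(retrieve(query, DOCUMENTS))
    # The model is instructed to answer only from the retrieved context,
    # reducing reliance on possibly outdated training data.
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

print(build_grounded_prompt("What is the savings account fee?"))
```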
Open source methods and tools such as TrustyAI and guardrails can also be implemented alongside LLMs to mitigate hallucinations and improve reliability.
TrustyAI is a suite of tools that aims to improve the explainability and trustworthiness of AI models, providing capabilities to interpret model decisions, identify biases, and monitor performance. “By applying TrustyAI to LLMs, it is possible to better understand how the model generates responses and identify possible hallucinations or misinformation,” says Rosalem.
Guardrails, on the other hand, are mechanisms that impose restrictions or checks on the outputs of AI models. They can be implemented to ensure that responses fall within a certain scope, follow specific policies, or are factually correct.
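A minimal sketch of that idea, assuming a hypothetical set of policy checks rather than any specific guardrails framework: the allowed topics, the source requirement and the fallback message are invented for illustration.

```python
# Minimal guardrail sketch: validate a model's draft answer before it reaches
# the customer. The checks below are illustrative assumptions only.
import re

ALLOWED_TOPICS = ("account", "fee", "card", "loan")
FALLBACK = "I can't confirm that; let me connect you with a human agent."

def apply_guardrails(answer: str, cited_sources: list[str]) -> str:
    in_scope = any(topic in answer.lower() for topic in ALLOWED_TOPICS)
    has_source = len(cited_sources) > 0  # require some factual grounding
    no_promised_returns = not re.search(r"guaranteed \d+% return", answer.lower())
    if in_scope and has_source and no_promised_returns:
        return answer
    return FALLBACK

print(apply_guardrails("Your card fee is $5/month.", cited_sources=["fee_table_2024"]))
print(apply_guardrails("Buy token X for a guaranteed 40% return!", cited_sources=[]))
```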
Colombo further adds that implementing human reviews “for critical data and sensitive responses can increase accuracy, especially in areas such as risk and compliance.”
At N5, they developed the Fin Sky solution, which combines two approaches. First, they adopted a distributed model in which multiple AIs work together. Second, they implemented a feedback process that continuously validates the input, processing and output of each user query. “This allowed us to reduce the hallucination rate to 0.3%, compared to the rate of 3% to 27% observed in chatbots, according to data from the startup Vectara,” explains Colombo. He adds that their AIs are trained exclusively on the institution’s own data, avoiding queries against random information on the internet, which further increases the accuracy of the responses. “This combination of methods ensures that the solutions are reliable and secure, considering a sector where information plays a critical role.”
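The sketch below is not N5’s Fin Sky code; it only illustrates, with hypothetical validation rules, what checking the input, processing and output stages of each query might look like in principle.

```python
# Illustrative sketch only: a query pipeline where each stage (input,
# processing, output) is validated, and a second "reviewer" check runs
# before an answer is released. All rules here are invented placeholders.

def validate_input(query: str) -> bool:
    return bool(query.strip()) and len(query) < 2000

def process(query: str) -> str:
    # Placeholder for the institution-specific model call.
    return f"Draft answer for: {query}"

def validate_output(answer: str, reviewer_check) -> bool:
    # A second model or rule set acts as a reviewer before release.
    return reviewer_check(answer)

def answer_query(query: str) -> str:
    if not validate_input(query):
        return "Please rephrase your question."
    draft = process(query)
    if not validate_output(draft, reviewer_check=lambda a: "Draft answer" in a):
        return "Escalated to a human analyst."
    return draft

print(answer_query("What is my credit limit?"))
```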