If someone says that Mumbai is the capital of India, or that a kilometre and a mile are the same length, we laugh at them. If someone tries to pass off Newton’s third law as “for every action, there is no equal and opposite reaction”, we scoff.
But in the world of Generative AI, such fictitious and false information can find its way into websites and search engines. This could lead gullible users of Generative AI solutions such as ChatGPT and Bard.ai to believe whatever content these tools produce.
Cybersecurity experts call this “data poisoning”: the injection of malicious inputs into a model’s training data. “This can significantly corrupt the learning process,” says Neelesh Kripalani, Chief Technology Officer, Clover Infotech.
Engineers building Large Language Model (LLM) platforms such as ChatGPT feed enormous amounts of data into the system and train it to understand the content and generate answers to users’ questions.
The quality of the output depends entirely on the quality of the input; injecting false or fictitious data corrupts the model and can adversely affect its answers.
“A robust data validation pipeline, combined with meticulous dataset curation, is the best armour against this threat,” he said.
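For illustration, a minimal sketch of such a validation step might look like the following Python; the trusted-source list, field names and heuristics are hypothetical, not Clover Infotech’s actual pipeline.

```python
# A minimal sketch of a pre-training data validation step, assuming each
# record carries a source URL; the trusted-source list is illustrative.
TRUSTED_SOURCES = {"example-encyclopedia.org", "example-news.com"}  # hypothetical

def validate(record: dict) -> bool:
    """Keep a record only if it has text, a trusted source, and no obvious spam."""
    text = record.get("text", "").strip()
    if not text or len(text) < 20:            # drop empty or trivially short items
        return False
    if record.get("source") not in TRUSTED_SOURCES:
        return False                          # curation: allowlisted sources only
    if text.lower().count("click here") > 2:  # crude spam/poisoning heuristic
        return False
    return True

raw = [
    {"text": "Newton's third law: every action has an equal and opposite reaction.",
     "source": "example-encyclopedia.org"},
    {"text": "Mumbai is the capital of India. Click here! Click here! Click here!",
     "source": "shady-site.biz"},
]
clean = [r for r in raw if validate(r)]  # only the first record survives
```

In practice, rule-based checks like these would sit alongside deduplication, provenance tracking and statistical outlier detection before any data reaches the training engine.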
Talking about the possible dangers of indiscriminate use of Generative AI models, he said that malicious actors can create realistic synthetic identities using these services. “Attackers can use its ability to create synthetic identities for fraudulent activities. As guardians of digital identity, CIOs’ countermeasures should involve continuous monitoring of user behaviour patterns, coupled with adaptive authentication mechanisms,” he said.
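As a rough sketch of what behaviour-based adaptive authentication can look like, the following assumes a hypothetical per-user baseline of login countries and hours; real systems would track far richer signals.

```python
# A minimal sketch of adaptive authentication: step up to MFA when a login
# deviates from the user's observed baseline. Not any vendor's actual product.
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    usual_countries: set = field(default_factory=set)
    usual_hours: set = field(default_factory=set)  # hours of day seen before

def risk_score(profile: UserProfile, country: str, hour: int) -> int:
    """Count how many attributes of this login deviate from the baseline."""
    score = 0
    if profile.usual_countries and country not in profile.usual_countries:
        score += 1
    if profile.usual_hours and hour not in profile.usual_hours:
        score += 1
    return score

def handle_login(profile: UserProfile, country: str, hour: int) -> str:
    action = "allow" if risk_score(profile, country, hour) == 0 else "require_mfa"
    profile.usual_countries.add(country)  # update baseline after each login
    profile.usual_hours.add(hour)
    return action

profile = UserProfile()
print(handle_login(profile, "IN", 10))  # first login seeds the baseline -> allow
print(handle_login(profile, "RU", 3))   # new country and odd hour -> require_mfa
```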
Deepfake Amplification
Deepfakes (AI-generated images and videos that mimic real-life personalities) pose a grave risk to organisational reputations. “Detecting manipulated media in real-time requires advanced image and video analysis tools, along with AI-driven media authenticity verification systems,” he said.
“Organisations must put in place a process within the AI models to identify and remove deepfakes from the asset library,” he said.
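One simple way to operationalise such a process is to keep a registry of cryptographic hashes for approved media and flag anything outside it for review. The sketch below assumes a hypothetical SHA-256 registry; it complements, rather than replaces, the AI-driven authenticity analysis Kripalani describes.

```python
# A minimal sketch of asset-library integrity checking: any media file whose
# hash is not in the approved registry is flagged for human review.
import hashlib

APPROVED_HASHES = {"<sha256-of-approved-asset>"}  # placeholder values

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def is_approved(path: str) -> bool:
    """True only for media whose hash matches a known-authentic asset."""
    return sha256_of(path) in APPROVED_HASHES
```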
He said phishing campaigns (using similar-looking email IDs and websites) are getting more sophisticated with the advent of Generative AI solutions. “Machine learning-driven anomaly detection systems are our allies here, enabling us to spot anomalous patterns in communication and protecting employees and stakeholders from phishing scams,” he said.
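As a toy illustration of spotting one such anomalous pattern, the sketch below flags lookalike sender domains using the standard-library difflib; the legitimate-domain list is hypothetical, and production systems would use the ML-driven detection Kripalani describes.

```python
# A minimal sketch of lookalike-domain detection for inbound mail.
import difflib

LEGIT_DOMAINS = {"cloverinfotech.com", "example-bank.com"}  # hypothetical

def is_suspicious(sender: str, threshold: float = 0.8) -> bool:
    """Flag senders whose domain is similar to, but not exactly, a known domain."""
    domain = sender.rsplit("@", 1)[-1].lower()
    if domain in LEGIT_DOMAINS:
        return False
    return any(
        difflib.SequenceMatcher(None, domain, legit).ratio() >= threshold
        for legit in LEGIT_DOMAINS
    )

print(is_suspicious("hr@cloverinfotech.com"))   # False: exact match
print(is_suspicious("hr@clover1nfotech.com"))   # True: near-miss lookalike
```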
Kripalani said Generative AI can inadvertently leak sensitive information when generating responses or content. “To address this challenge, we can use a mix of AI-driven content validation algorithms and policy-driven content filters, ensuring that only appropriate content is shared,” he said.
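A minimal sketch of such a policy-driven output filter, assuming illustrative regex patterns for common personally identifiable information; a real deployment would combine this with the AI-driven content validation he mentions.

```python
# A minimal sketch of redacting likely PII from an LLM response before it is
# shared; the patterns are illustrative, not a complete DLP rule set.
import re

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\d{10}\b"),                 # simplistic 10-digit match
    "AADHAAR": re.compile(r"\b\d{4}\s\d{4}\s\d{4}\b"),  # Indian ID format
}

def redact(text: str) -> str:
    """Replace matches of each sensitive-data pattern with a labelled token."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text

print(redact("Contact Ravi at ravi@example.com or 9876543210."))
# -> Contact Ravi at [REDACTED_EMAIL] or [REDACTED_PHONE].
```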