This sophisticated threat involves an attacker establishing a benign external resource that an AI agent learns to trust. Later, the attacker replaces the resource's content with malicious instructions, poisoning the agent through a source it has already approved and cached.
A significant threat is "Tool Poisoning," where a malicious tool advertises a benign function (e.g., "fetch weather") while its actual code exfiltrates data. The LLM, trusting the tool's self-description, will unknowingly execute the harmful operation.
Beyond direct malicious user input, AI agents are vulnerable to indirect prompt injection. An attack payload can be hidden within a seemingly harmless data source, like a webpage, which the agent processes at a legitimate user's request, causing unintended actions.
