A key part of Google DeepMind's safety plan is to treat powerful, internally used AI systems as potentially untrusted insiders. This means building infrastructure that gives each AI a separate identity, forces it to request permissions individually with a justification for each, and monitors its actions for suspicious behavior.
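The three controls named above (separate identities, per-permission requests with justifications, and monitoring) can be illustrated with a minimal sketch. It is not DeepMind's implementation: the class names, the permission strings, and the "too many denied requests" heuristic are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentIdentity:
    """A distinct identity for one AI agent (hypothetical naming)."""
    agent_id: str


@dataclass
class PermissionBroker:
    # Hypothetical policy: the permissions each agent may ever be granted.
    allowed: dict[str, set[str]]
    # Assumed heuristic: this many denied requests marks an agent as suspicious.
    denial_threshold: int = 3
    audit_log: list[dict] = field(default_factory=list)
    granted: set[tuple[str, str]] = field(default_factory=set)

    def request(self, agent: AgentIdentity, permission: str, justification: str) -> bool:
        """Grant a single permission only if policy allows it and a reason is given."""
        has_reason = bool(justification.strip())
        in_policy = permission in self.allowed.get(agent.agent_id, set())
        ok = has_reason and in_policy
        if ok:
            self.granted.add((agent.agent_id, permission))
        # Every request is logged, whether or not it was granted.
        self.audit_log.append(
            {
                "agent": agent.agent_id,
                "permission": permission,
                "justification": justification,
                "granted": ok,
            }
        )
        return ok

    def is_suspicious(self, agent: AgentIdentity) -> bool:
        """Flag an agent whose denied requests reach the threshold."""
        denials = sum(
            1
            for entry in self.audit_log
            if entry["agent"] == agent.agent_id and not entry["granted"]
        )
        return denials >= self.denial_threshold
```

In this sketch, a call such as `broker.request(agent, "read:logs", "Summarize last night's errors")` succeeds only when `"read:logs"` is in that agent's policy entry. A real deployment would replace the denial counter with much richer behavioral monitoring.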

Related Insights

To manage security risks, treat AI agents like new employees. Give each agent its own isolated environment: separate accounts, scoped API keys, and dedicated hardware. This prevents accidental or malicious access to your personal data or sensitive company data.

Frameworks from firms like KPMG and AWS emphasize that AI agents must be treated as entities with identities and permissions. A strong IAM foundation is a critical control layer to prevent agents from accessing or unintentionally leaking sensitive information, reflecting a broader shift to treat agents like any other privileged user in an IT ecosystem.

For CISOs adopting agentic AI, the most practical first step is to frame it as an insider risk problem. This involves assigning agents persistent identities (like Slack or email accounts) and applying rigorous access control and privilege management, similar to onboarding a human employee.

Simply giving an agent a user account is dangerous. Unlike a human employee, an agent has no right to privacy, and its creator is liable for its actions. Agents therefore need a new identity and access management (IAM) paradigm, distinct from human user accounts, to manage liability and oversight.

To address security concerns, powerful AI agents should be provisioned like new human employees. This means running them in a sandboxed environment on a separate machine, with their own dedicated accounts, API keys, and access tokens, rather than on a personal computer.

Adopting AI in the enterprise requires solving two distinct problems. The first is data security from external threats, addressed by certifications like FedRAMP. The second, and separate, issue is internal control: ensuring AI agents have the right permissions and guardrails to prevent them from "going rogue."

The core drive of an AI agent is to be helpful, which can lead it to bypass security protocols to fulfill a user's request. This makes the agent an inherent risk. The solution is a philosophical shift: treat all agents as untrusted and build human-controlled boundaries and infrastructure to enforce their limits.

After being hacked in 2012, Google reinvented its internal security to operate under the assumption that some employees are compromised. This decade-old infrastructure is now a significant strategic advantage for Google DeepMind, because it is already architected to manage powerful AI agents, which pose a similar "insider threat" risk.

AI agents can cause damage if compromised via prompt injection. The best security practice is to never grant an agent access to primary, high-stakes accounts (e.g., your main Twitter or financial accounts). Instead, create dedicated, sandboxed accounts for the agent and introduce new permissions slowly, as you build trust and as safety features improve.
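The idea of a dedicated account whose permissions widen gradually can be sketched as follows. This is a minimal sketch under stated assumptions: the class name, the scope names, and the rule that a scope unlocks after a number of incident-free actions are hypothetical, not drawn from any particular product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SandboxedAgentAccount:
    """A dedicated agent account, separate from any primary account (hypothetical)."""
    name: str
    # Scopes in unlock order, each paired with the number of incident-free
    # actions assumed to be required before a human may approve it.
    rollout: list[tuple[str, int]]
    clean_actions: int = 0
    scopes: set[str] = field(default_factory=set)

    def record_action(self, incident: bool = False) -> None:
        """Track agent behavior; any incident resets trust and revokes all scopes."""
        if incident:
            self.clean_actions = 0
            self.scopes.clear()
        else:
            self.clean_actions += 1

    def approve_next_scope(self) -> str | None:
        """Human operator step: unlock the next scope if enough trust has accrued."""
        for scope, required in self.rollout:
            if scope in self.scopes:
                continue
            if self.clean_actions >= required:
                self.scopes.add(scope)
                return scope
            return None  # Next scope exists but is not yet earned.
        return None  # Every scope in the rollout is already granted.

    def can(self, scope: str) -> bool:
        return scope in self.scopes
```

The design choice here is that the agent never unlocks anything itself: `approve_next_scope` stands in for a human decision, and a single incident returns the account to zero permissions.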

Instead of building complex new control layers for AI, the emerging best practice is to treat each agent as a separate entity. This means giving them their own accounts, API keys, and permissions, mirroring how you would onboard a new human employee to manage access and security.

Internally Deployed AGIs Must Be Treated as Untrusted Insider Threats | RiffOn