Prompt injection is a supply-chain problem

Most defenses against prompt injection still assume the attack surface ends at the model boundary. It doesn't. When an LLM-based system retrieves context from a vector store, calls external tools, and feeds results to downstream consumers, every link in that chain becomes a potential injection point. The mental model should not be "input sanitization" — it should be supply-chain security.

The retrieval layer is the first trust boundary

RAG pipelines retrieve documents and inject them into the prompt context. If an attacker can influence the content of those documents — through a compromised data source, a poisoned embedding, or a carefully crafted public-facing page that gets indexed — they control part of the prompt. This is not a hypothetical. We have observed it in three separate engagements this year, each time in an organization that had invested in output filtering but had no controls on retrieval provenance.

The fix is not to filter the retrieved content for known attack patterns. That approach fails for the same reason signature-based antivirus fails: the attacker adapts. Instead, treat retrieved context the way you treat third-party dependencies — pin it, hash it, monitor it for unexpected changes, and limit the permissions of the system that consumes it.
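A minimal sketch of what "pin it, hash it" could look like at retrieval time, assuming a manifest of content hashes recorded when each document was reviewed and indexed. The names RetrievedDoc, sha256_hex, and filter_by_manifest are hypothetical and do not come from any particular RAG framework. Only hashlib from the Python standard library is used.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedDoc:
    source_id: str  # hypothetical stable identifier for an indexed document
    text: str       # content as returned by the retriever


def sha256_hex(text: str) -> str:
    """Hash document content so later changes can be detected."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_by_manifest(docs, manifest):
    """Split docs into (trusted, rejected) using pinned content hashes.

    `manifest` maps source_id -> hash recorded at review time (an assumption
    of this sketch). A document is rejected if its source is unknown or its
    content no longer matches the pinned hash.
    """
    trusted, rejected = [], []
    for doc in docs:
        pinned = manifest.get(doc.source_id)
        if pinned is not None and pinned == sha256_hex(doc.text):
            trusted.append(doc)
        else:
            rejected.append(doc)
    return trusted, rejected
```

Only the trusted list would be placed in the prompt context. The rejected list is the signal to monitor: a known source whose hash has changed is exactly the "unexpected change" worth alerting on.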

Tool calls extend the blast radius

When an agent can execute tools — send emails, query databases, modify records — prompt injection becomes remote code execution by another name. The injection does not need to exfiltrate data through the model's text output. It can instruct the model to take an action directly. Every tool binding is an implicit permission grant, and most agent frameworks grant those permissions without scoping them to the task at hand.

We recommend treating tool bindings the way you treat IAM policies: least privilege, scoped to the session, auditable, and revocable. If an agent can write to a production database, each write should require explicit human approval, not a system prompt that says "only write when appropriate."
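One way to picture IAM-style tool bindings is a per-session registry that denies by default, logs every decision, and gates sensitive tools behind an approval callback. The sketch below is illustrative only: SessionTools, ToolGrant, ToolDenied, and the approver callback are hypothetical names, not the API of any agent framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class ToolDenied(Exception):
    """Raised when a tool call is not granted or not approved."""


@dataclass
class ToolGrant:
    func: Callable[..., object]
    needs_approval: bool = False  # e.g. True for writes to production data


@dataclass
class SessionTools:
    session_id: str
    grants: Dict[str, ToolGrant] = field(default_factory=dict)
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def grant(self, name, func, needs_approval=False):
        self.grants[name] = ToolGrant(func, needs_approval)

    def revoke(self, name):
        self.grants.pop(name, None)

    def call(self, name, *args, approver=None, **kwargs):
        grant = self.grants.get(name)
        if grant is None:  # deny by default: only granted tools are callable
            self.audit_log.append((self.session_id, name, "denied:not_granted"))
            raise ToolDenied(name)
        if grant.needs_approval and not (approver and approver(name, args, kwargs)):
            self.audit_log.append((self.session_id, name, "denied:not_approved"))
            raise ToolDenied(name)
        self.audit_log.append((self.session_id, name, "allowed"))
        return grant.func(*args, **kwargs)
```

In this sketch the approver stands in for a human-in-the-loop step. The model never supplies it; the surrounding application does, so an injected instruction cannot approve its own write.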

Downstream consumers inherit the risk

The output of one LLM call often becomes the input to another. Summarization feeds into reporting. Classification feeds into routing. If the first model's output is compromised, every downstream consumer is operating on tainted data. This is the supply-chain analogy in its purest form: a vulnerability anywhere in the pipeline can propagate to every system that depends on it.
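The propagation described above can be made explicit by carrying a taint flag alongside every model output, so that downstream stages know when they are operating on data derived from untrusted input. This is a sketch under assumptions: Tainted, derive, and route are hypothetical helpers, and the actual model calls are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    value: str
    tainted: bool
    sources: frozenset = frozenset()  # where the untrusted input came from


def derive(value: str, *inputs: Tainted) -> Tainted:
    """Wrap a stage's output; it is tainted if any of its inputs were."""
    return Tainted(
        value=value,
        tainted=any(i.tainted for i in inputs),
        sources=frozenset().union(*(i.sources for i in inputs)),
    )


def route(label: Tainted) -> str:
    """Example consumer: tainted classifications go to human review."""
    return "human_review" if label.tainted else "auto:" + label.value


# Usage: a summary built from an untrusted web page taints the classification
# that is later derived from that summary.
page = Tainted("page text", tainted=True, sources=frozenset({"web_index"}))
policy = Tainted("internal policy", tainted=False)
summary = derive("summary text", page, policy)
label = derive("billing", summary)
```

Here route(label) returns "human_review" because the taint from the web page survives two stages. The design choice is that taint can only be added by derivation, never silently dropped.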

The question is not "can our model be tricked?" The question is "what can an attacker reach if it is?"

We have developed a threat-modeling framework that maps these dependencies explicitly. For each component in an LLM pipeline, we identify the trust boundaries, the data flows, and the blast radius of a successful injection. The result is not a checklist — it is a dependency graph that integrates with the organization's existing SSDLC controls and incident-response playbooks.
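To illustrate the general idea of a dependency graph with a computable blast radius, here is a small sketch. It is not the framework described above; it is a plain breadth-first traversal over a hypothetical pipeline, and every component name in PIPELINE is invented for the example.

```python
from collections import deque


def blast_radius(edges, compromised):
    """Return every component reachable downstream of `compromised`.

    `edges` maps a component to the components that consume its output.
    """
    seen = set()
    queue = deque([compromised])
    while queue:
        node = queue.popleft()
        for consumer in edges.get(node, ()):
            if consumer not in seen and consumer != compromised:
                seen.add(consumer)
                queue.append(consumer)
    return seen


# Hypothetical pipeline: data flows from each key to the listed consumers.
PIPELINE = {
    "web_index": ["retriever"],
    "retriever": ["summarizer"],
    "summarizer": ["report_writer", "ticket_router"],
    "ticket_router": ["email_tool"],
    "crm_db": ["report_writer"],
}
```

With this graph, blast_radius(PIPELINE, "web_index") covers the retriever, the summarizer, both downstream consumers, and the email tool, while blast_radius(PIPELINE, "crm_db") covers only the report writer. The difference between those two sets is the kind of answer the question "what can an attacker reach?" calls for.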

If your organization runs LLMs in production, the attack surface is larger than the model. Treat prompt injection as a supply-chain problem, and the defenses start to look very different.