I try to be very particular with words and definitions. I don’t always get it right, but what helps me keep things clearer is to have a model that can delineate the differences between certain terms that often get confused.
In the world of LLMs, two terms that often get mixed up is jailbreaking and prompt injection.
The model that I use to help me understand the distinction between these terms is Leavitt’s Diamond Model (also better known as the “People, Process, and Technology” mental model.)
Within this model, the “People” is the LLM itself that was trained. (Think about how a person is trained through education.)
I think the “Technology” would be components such as the Orchestration system (e.g., LangChain), LLM Cache (e.g., Redis), hosting (e.g., Vercel), etc.
It’s worth noting Leavitt’s original model was called Structure, Tasks, People, and Technology. Structure and Tasks got combined into Process, but in this case, it’s worth keeping them separate here.
The “Structure” is the guardrails.
The “Tasks” is the system prompt.
Jailbreaking attacks the STRUCTURE whereas prompt injection attacks the TASKS.
Perhaps one of the reasons that these two get confused is because, using Leavitt’s model, they are fundamentally both process-oriented attacks.