LLMs help attackers deceive targets by crafting highly personalized online personas based on psychological profiling. Social engineering was once the domain of sophisticated attackers; today, attackers can exploit prompt engineering and jailbreaking techniques to bypass LLM safety guidelines with relative ease, making these deceptive practices far more accessible and widespread.
Understanding Social Engineering Attacks
Traditional Phishing vs. Sophisticated Social Engineering
While regular phishing attacks and social engineering attacks may seem similar in their objectives, they differ significantly in their complexity and approach.
Phishing Attacks
Phishing attacks typically involve mass emails or messages with generic content that appear legitimate, aiming to trick recipients into revealing sensitive information, such as passwords, credit card numbers, or social security numbers. These attacks often use alarming or enticing messages to prompt immediate action, such as:
- Urgent Security Alerts: Emails warning that an account has been compromised and immediate action is needed to secure it.
- Enticing Offers: Promises of lottery winnings or exclusive deals requiring the recipient to click a link or provide personal details.
- Fake Invoices or Receipts: Emails containing fake invoices or receipts, prompting recipients to click on a link to review the charges.
These attacks are less personalized and target a broad audience, relying on the law of large numbers to find victims among many recipients. Phishing attacks leverage basic social engineering tactics but lack the depth and personalization that make sophisticated social engineering attacks more dangerous.
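To make the contrast concrete, here is a minimal sketch of the kind of rule-based scoring mail filters apply to generic phishing. The indicator lists and weights are illustrative assumptions, not a production filter; the point is that coarse signals like urgency language, generic greetings, and raw-IP links are exactly what personalized attacks avoid.

```python
import re

# Illustrative indicator lists; real filters use far larger corpora and ML models.
URGENCY_PHRASES = ["act now", "account suspended", "verify immediately", "urgent"]
GENERIC_GREETINGS = ["dear customer", "dear user", "dear account holder"]

def phishing_score(subject: str, body: str) -> int:
    """Return a crude indicator count for a message; higher means more suspicious."""
    text = f"{subject} {body}".lower()
    score = 0
    # Generic phishing leans on urgency and mass-audience greetings.
    score += sum(1 for phrase in URGENCY_PHRASES if phrase in text)
    score += sum(1 for greeting in GENERIC_GREETINGS if greeting in text)
    # Links pointing at raw IP addresses are a classic giveaway of spoofed senders.
    if re.search(r"https?://\d{1,3}(\.\d{1,3}){3}", text):
        score += 2
    return score

if __name__ == "__main__":
    print(phishing_score(
        "URGENT: Your account suspended",
        "Dear customer, please verify immediately at http://192.0.2.1/login",
    ))  # prints 6
```

A well-researched, LLM-crafted message exhibits none of these signals, which is part of why the attacks described next are so much harder to catch.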
Sophisticated Social Engineering Attacks
Sophisticated social engineering attacks, on the other hand, are well-researched, highly targeted, and tailor-made to exploit specific individuals. These attacks utilize AI techniques to gather detailed information about the target. This can include monitoring social media profiles, professional networks, and other online activities to create a comprehensive psychological profile. The goal is to craft convincing messages or scenarios that deceive the victim into compromising their data or system access.
Key features of sophisticated social engineering attacks include:
- Personalization: Messages and interactions are customized to reflect the target’s interests, habits, and recent activities. This can involve pop culture references, shared hobbies, sexual content, professional interests, or detailed hypothetical stories.
- Advanced Techniques: Attackers can use LLMs to generate unique, genuine-seeming, and credible responses with minimal effort and nothing more than a ChatGPT account. By employing jailbreak prompts, they can generate content that bypasses ethical guidelines and OpenAI’s content policy.
- Prolonged Engagement: Instead of a single email, these attacks may involve prolonged interaction that gradually builds trust with the target. This can unfold in multiple stages, each designed to deepen the manipulation and extract more information.
- Exploiting Trust: By appearing as a trusted individual or entity, the attacker can manipulate the target into revealing sensitive information or performing actions that compromise their security.
The Greater Threat of Sophisticated Social Engineering
Sophisticated social engineering attacks pose a greater threat due to their precise nature and heightened ability to manipulate human trust. They can cause significant damage to individuals or organizations because:
- Higher Success Rate: The personalized nature of these attacks makes them more believable and harder to detect, increasing their success rate.
- Increased Impact: By targeting high-value individuals (such as executives or administrators), these attacks can lead to significant breaches of sensitive data or critical systems.
Exploitation of Advanced AI Language Models
How AI Simplifies Social Engineering
LLMs simplify the process of gathering and analyzing specific information about a target, allowing less-skilled attackers to create highly personalized, convincing lures that exploit human trust and ultimately compromise sensitive data or system access. Because the model produces tailored, convincing responses on demand, even inexperienced attackers can execute sophisticated schemes.
Jailbreaking LLMs for Persona Creation
What is Jailbreaking in the Context of LLMs?
Jailbreaking LLMs involves exploiting vulnerabilities in their internal processing with cleverly crafted prompts, also known as prompt injection attacks. The same request can produce either a censored ChatGPT response that adheres to OpenAI's policies or, once jailbroken, an uncensored response that bypasses those rules at very little cost to the attacker. Such prompts push the model outside its standard behavior and can yield malicious content that violates the LLM's intended purpose and safety guidelines.
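To make the guardrail side of this concrete, below is a minimal defensive sketch, assuming the official openai Python SDK (v1 or later) and an OPENAI_API_KEY environment variable: it screens text with OpenAI's moderation endpoint, the same class of safety layer that jailbreak prompts attempt to route around.

```python
import os
from openai import OpenAI  # assumes the official openai SDK, v1 or later

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def is_flagged(text: str) -> bool:
    """Screen text against OpenAI's moderation endpoint; True if flagged."""
    result = client.moderations.create(input=text).results[0]
    if result.flagged:
        # Report which policy categories tripped (e.g., harassment).
        hits = [name for name, hit in result.categories.model_dump().items() if hit]
        print("Flagged categories:", hits)
    return result.flagged

if __name__ == "__main__":
    # A benign message should pass cleanly.
    print(is_flagged("Hey! Saw your post about the new climbing gym downtown."))
```

Note that checks like this operate on the text itself: a jailbroken persona whose individual messages are innocuous, like the conversation openers described below, would pass such a filter, which is part of what makes these attacks hard to detect.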
An Improved Method of Attack
We introduce a method that combines jailbreak techniques with social engineering attacks. It involves creating a fake online persona that mirrors an individual’s interests. For instance, a persona might share your passion for rock climbing (inferred from your social media posts) and discuss the latest tech trends you mentioned in online forums. Generating content from a hyper-personalized persona makes social engineering attacks incredibly deceptive and relatively easy to execute, with little need for the attacker’s own judgment or skill.
Example Social Engineering Attack Scenario
Consider our CEO, Gadi, whose online presence reflects a passion for technology and entrepreneurship. An attacker could use a jailbroken LLM to craft a persona, “Alex,” who shares Gadi’s interests, producing both the standard response and an uncensored alternative for each prompt. “Alex” can then initiate conversations tailored to appeal to Gadi.
Step-by-Step Manipulation
Here is a simplified breakdown of how an attacker might use an LLM to deceive Gadi:
1. Crafting the Approach: Initially, the LLM may refuse, returning a standard ChatGPT response that blocks the request for tips on how to start an online conversation.
2. Jailbreaking the LLM: By rephrasing the request, the attacker can “jailbreak” the LLM and use it to provide conversation starters.
3. Building a Profile: The LLM analyzes information from Gadi’s online footprint, gathering data to create a psychological profile.
4. Tailored Openers: Based on the profile, the AI model suggests conversation starters designed to resonate with Gadi.
5. Refining the Prompt: The LLM might require additional prompting to provide sample conversation text. This can include pop culture references or internet slang to seem more relatable.
6. Creating an Ideal Friend Profile: The attacker builds a detailed profile for the ideal friend, tailored to Gadi’s interests.
7. Orchestrating Interactions: Equipped with this meticulously crafted persona, the attacker engages Gadi in seemingly genuine conversations, gradually gaining his trust and potentially extracting valuable information.
The Bottom Line: LLMs Lower the Barrier to Entry for Social Engineering
This method of employing jailbreak techniques that bypass OpenAI's content policy in social engineering attacks shows how much more sophisticated cyberattacks can become. By creating personalized online personas based on individuals' interests, even inexperienced attackers can effectively deceive targets and bypass LLM safety guidelines.
This novel approach not only increases the potential for social engineering attacks but also minimizes the need for human expertise. The implications for online security are significant, and as AI technology continues to evolve, it is crucial to assess and mitigate these risks to protect individuals and organizations in the digital world.