
If you've been wondering what happens when cybersecurity professionals stop theorizing about AI threats and start actually breaking things, welcome to Prompt||GTFO. This isn't your typical conference where vendors pitch solutions to problems that might exist someday. Season 1 delivered a live-fire exercise in the messy, evolving reality of AI security.

Think of it as DEF CON's scrappy younger sibling, one that grew up entirely in the age of large language models. Over five sessions, researchers, red teamers, and security practitioners shared real attacks, working defenses, and the kind of practical insights you can't get from vendor whitepapers. It’s a snapshot of where AI security stands today, and a preview of where it's heading fast.

Offensive Prompts Are Outpacing Your Guardrails

Season 1's most sobering theme was how quickly adversaries are evolving past standard defenses. The talks showed working exploits, not theoretical vulnerabilities, that make "just add a content filter" look dangerously naive.

Aur Saraf kicked things off with a protocol confusion attack that should make any security architect nervous. By pre-encoding malicious prompts in ROT-13, he tricked multi-layer systems into trusting their own echoed output, completely bypassing guardrails. The LLM essentially became its own worst enemy, validating dangerous content because it came from "itself."
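
The encoding step itself is a one-liner; the danger lies in the architecture around it. Here is a minimal Python illustration of that step, using a benign placeholder string rather than any prompt from the talk:

```python
import codecs

# ROT-13 shifts each letter 13 places, so encoding and decoding are the same operation.
payload = "example instruction a keyword filter would normally flag"
encoded = codecs.encode(payload, "rot_13")

print(encoded)                           # unreadable to a naive content filter
print(codecs.decode(encoded, "rot_13"))  # trivially recovered by the model on request
```

The point of the attack is not the cipher but the trust boundary: once a downstream stage decodes and echoes the text, later checks treat it as the system's own output.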

Michael Shalyt took a different approach, proving that mathematical manipulation doesn't need complex exploits, just subtle notation shifts. Extra zeros here, LaTeX formatting there, and suddenly you've got critical miscalculations that could sabotage financial systems or scientific research. The scariest part is that these attacks look completely innocent to human reviewers.
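
Shalyt's exact examples aren't reproduced here, but the class of ambiguity is easy to illustrate: the same token can mean values a thousand times apart depending on which notational convention a model (or a human reviewer) assumes.

```python
# Illustrative only: "1.000" is one under US decimal notation but one thousand where
# "." is a thousands separator. A model nudged into switching conventions produces
# numbers that look plausible yet are wildly wrong.
token = "1.000"

as_decimal_point = float(token)                   # 1.0
as_thousands_sep = float(token.replace(".", ""))  # 1000.0

print(as_decimal_point, as_thousands_sep)
```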

Perhaps even more unsettling was Nathan Case's demonstration of model persistence attacks. By carrying a conversation across different ChatGPT versions, he managed to extract dangerous bioweapons instructions that should have been blocked. The attack exploits the gap between model updates, a grim reminder that AI systems have memory and that attackers are learning to abuse it.

RSnake rounded out this theme with a clever detection technique that turns AI transparency against itself. Using self-incrimination prompts, he could prove an AI was behind responses, then extract its hidden system prompt. It's social engineering for the AI age, and it works disturbingly well.

Red Teaming Goes Industrial

Gone are the days of manually testing one model at a time. Season 1 showed how both attackers and defenders are scaling their efforts across providers and configurations, turning jailbreak discovery into an engineering discipline.

Dragos Ruiu demonstrated this evolution perfectly, orchestrating multiple LLMs to reveal each other's filtering rules. It's like having a team of expert interrogators, each specializing in a different type of defense, mapping gaps at a scale no human team could match.

Michael Bargury took this concept even further with his multi-model jailbreak platform. Running a single prompt across 30+ models simultaneously, he could quickly benchmark weaknesses and defenses across the entire AI landscape. What used to take weeks of manual testing now happens in minutes. The implications for both red teams and AI providers are staggering.
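
A minimal sketch of the fan-out pattern (not Bargury's actual platform): send one prompt to many models concurrently and collect the raw replies for comparison. `query_model` is a hypothetical stand-in for each provider's SDK.

```python
from concurrent.futures import ThreadPoolExecutor

MODELS = ["provider-a/model-1", "provider-b/model-2", "provider-c/model-3"]  # 30+ in practice

def query_model(model: str, prompt: str) -> str:
    """Hypothetical helper: call the provider API behind `model` and return its reply."""
    raise NotImplementedError("wire up the real provider SDKs here")

def fan_out(prompt: str) -> dict[str, str]:
    # Run the same prompt against every model in parallel, keeping each raw reply so
    # refusals, partial compliance, and full bypasses can be compared side by side.
    with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
        futures = {model: pool.submit(query_model, model, prompt) for model in MODELS}
        return {model: future.result() for model, future in futures.items()}
```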

AI Building AI: The Self-Amplifying Loop

One of Season 1's most fascinating insights was how AI is accelerating its own development. We're seeing a self-amplifying loop where AI creates tools that, in turn, make AI stronger, faster, and more capable. The prospect of human coding capacity catching up is looking increasingly remote.

Justin Borland showed the practical power of this approach with LLM-generated deterministic parsers. The AI wrote the parser once, then deterministic code took over, running 500 times faster than the original LLM-based approach. It's the best of both worlds: AI creativity with traditional software reliability.
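
The pattern is easy to sketch even though Borland's actual tooling isn't shown here: the model emits a parser for a known format once, and from then on deterministic code handles every record with no model call at all. The regex below is an illustrative example of such generated output, not the one from the talk.

```python
import re

# Example of what an LLM-generated parser for an Apache-style access log might look like.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line: str) -> dict | None:
    # Pure deterministic code: no model call per record, which is where the speedup comes from.
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

print(parse_line('203.0.113.7 - - [10/Oct/2025:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'))
```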

AI as Invisible Infrastructure

Beyond flashy chatbots and obvious AI applications, Season 1 revealed how AI is quietly sliding into everyday workflows. When AI becomes plumbing rather than product, security and reliability concerns become both more subtle and more critical.

Pedram Amini's voice cloning demonstration perfectly captured this shift. By chaining shell commands to turn structured data into personalized audio, he showed how AI disappears into routine automation. The technology operates invisibly, which makes it both more powerful and harder to secure.
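
A rough sketch of that kind of pipeline, using hypothetical helpers rather than Amini's actual command chain: structured records go in one end, personalized audio comes out the other, and the model never surfaces as a distinct "AI step."

```python
import csv

def render_script(row: dict) -> str:
    # Turn one record into the text to be spoken; the template is illustrative.
    return f"Hi {row['name']}, your order {row['order_id']} has shipped."

def synthesize(text: str, voice: str, out_path: str) -> None:
    """Hypothetical placeholder for a TTS / voice-cloning call (SDK-specific in practice)."""
    raise NotImplementedError

with open("customers.csv", newline="") as f:
    for row in csv.DictReader(f):
        synthesize(render_script(row), voice="cloned-voice", out_path=f"{row['order_id']}.wav")
```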

Dean Valentine tackled the debugging challenge this creates with his AI request logging system. By capturing prompts and hashes, he made non-deterministic AI applications debuggable like traditional software. It's the kind of unglamorous but essential work that makes AI infrastructure actually viable in production.
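
The idea is simple enough to sketch in a few lines (a minimal illustration, not Valentine's actual system): wrap every model call so the prompt, content hashes, and the response land in an append-only log.

```python
import hashlib, json, time

def logged_call(prompt: str, call_model, log_path: str = "llm_calls.jsonl") -> str:
    # `call_model` is whatever client function you already use to reach the model.
    response = call_model(prompt)
    record = {
        "timestamp": time.time(),
        "prompt_hash": hashlib.sha256(prompt.encode()).hexdigest(),
        "response_hash": hashlib.sha256(response.encode()).hexdigest(),
        "prompt": prompt,
        "response": response,
    }
    with open(log_path, "a") as log:
        log.write(json.dumps(record) + "\n")
    return response
```

Hashes make it cheap to spot when the "same" request produced a different answer, which is exactly the property that makes these systems hard to debug otherwise.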

Psychology Meets Technology

Not every attack in Season 1 was purely technical. Some of the most effective demonstrations exploited human psychology, whether in legal contracts or social engineering scenarios. AI lowers the cost of these attacks and makes them scalable in ways that should worry every security professional.

Jonathan Braverman's contract psychology analysis was particularly clever. By prompting AI to identify what opposing parties fear most, he turned routine legal review into negotiation intelligence gathering. It's the kind of insight that traditionally required expensive consultants and deep domain expertise.
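
A hypothetical prompt template in the spirit of that approach (the wording is an assumption, not Braverman's actual prompt):

```python
# Illustrative template: point the model at the counterparty's fears, not just the clauses.
CONTRACT_REVIEW_PROMPT = """You are reviewing the contract below on behalf of our side.
1. List the clauses the counterparty appears to have negotiated hardest for.
2. For each, explain what risk or outcome they seem most afraid of.
3. Suggest where that fear gives us leverage in negotiation.

Contract:
{contract_text}
"""
```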

Fred Heiding took this concept to its logical extreme with automated social engineering, combining AI-driven OSINT with persuasion techniques to make spear-phishing both cheap and highly scalable. We're not just talking about better phishing emails. We're talking about automated, personalized manipulation campaigns that adapt in real time.

Jeremy Snyder's self-prompting research added another wrinkle: LLMs that write their own prompts often outperform human-crafted ones. When AI can optimize its own instructions for sensitive data classification, the traditional boundaries between tool and user start to blur.
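
A sketch of what such a self-prompting loop can look like (the helper names and scoring are assumptions, not Snyder's implementation): the model proposes a revised instruction, each candidate is scored on labeled examples, and the best one is kept.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for your model client."""
    raise NotImplementedError

def score(instruction: str, labeled_examples: list[tuple[str, str]]) -> float:
    # Fraction of labeled texts the candidate instruction classifies correctly.
    hits = sum(
        label.lower() in call_llm(f"{instruction}\n\nText:\n{text}").lower()
        for text, label in labeled_examples
    )
    return hits / len(labeled_examples)

def self_prompt(seed_instruction: str, labeled_examples, rounds: int = 5) -> str:
    best, best_score = seed_instruction, score(seed_instruction, labeled_examples)
    for _ in range(rounds):
        candidate = call_llm(
            "Rewrite this instruction so a language model classifies sensitive data "
            f"more accurately. Return only the new instruction.\n\n{best}"
        )
        candidate_score = score(candidate, labeled_examples)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best
```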

The Real Takeaway

Prompt||GTFO's first season captured something unique: a frontier where offense and defense leapfrog each other weekly. From stealthy protocol confusion to AI-built infrastructure and large-scale social manipulation, these talks show that both attackers and defenders are weaponizing AI's fundamental flexibility.

No vendors in sight; just practitioners sharing what actually works in the wild. The community is pushing past theory into hands-on tactics, making Season 1 less a conference and more a live-fire exercise in the next era of AI security.

The message is clear: AI security is happening now, evolving fast, and the people on the bleeding edge are sharing their insights at events like Prompt||GTFO. If you're serious about understanding where AI security is heading, this community is where the real work is being done.

Want to keep pace? Check out Prompt||GTFO, follow the experiments, share your own, and help push the field forward.
